What Is Optical Character Recognition (OCR)?

Optical Character Recognition (OCR) refers to software that creates a digital version of a printed, typed, or handwritten document that a computer can read, with no need to retype or manually enter the text. OCR is most often applied to documents scanned into PDF format, but it can also produce a computer-readable version of the text inside an image file.

What Is OCR?

OCR, also called text recognition, is a software technology that converts characters such as numbers, letters, and punctuation marks (called glyphs) from printed or written documents into an electronic form that computers and other software programs can easily recognize and read. Some OCR programs do this while a document is being scanned or photographed with a digital camera; others can apply the process to documents that were previously scanned or photographed without OCR. OCR lets users search within PDF documents, edit text, and reformat documents.

What Is OCR Used For?

For quick, everyday scanning needs, OCR may not matter much. If you do a large amount of scanning, though, being able to search within PDFs to find the exact one you need can save a good deal of time, which makes OCR functionality in your scanning program all the more important. Here are some other things OCR helps with:

Why Use OCR?

Why not just take a picture? Because you would not be able to edit or search the text: the result would be nothing more than an image. Scanning the document and running OCR software turns that file into something you can both edit and search.

The History of OCR

While the first use of text recognition dates back to 1914, widespread development and use of OCR-related technologies began in earnest in the 1950s, particularly with the creation of highly simplified fonts that were easier to convert into digitally readable text. The first of these simplified fonts was created by David Shepard and is commonly known as OCR-7B. OCR-7B is still used today in the financial industry as the standard font on credit cards and debit cards. In the 1960s, postal services in several countries, including the United States, Great Britain, Canada, and Germany, began using OCR technology to greatly speed up mail sorting. OCR remains the core technology used to sort mail for postal services around the world. In 2000, key knowledge of the limits and capabilities of OCR technology was used to develop the CAPTCHA programs that stop bots and spammers.

Over the decades, OCR has grown more accurate and more sophisticated thanks to advances in related areas of technology such as artificial intelligence, machine learning, and computer vision. Today, OCR software uses pattern recognition, feature detection, and text mining to convert documents faster and more accurately than ever before.
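
To give a feel for what "pattern recognition" means here, the toy sketch below compares a small black-and-white glyph against a handful of stored templates and picks the closest match. It is only an illustration under heavy assumptions: the 3x5 bitmaps, the `TEMPLATES` table, and the `similarity`, `recognize`, and `read_line` helpers are all made up for this example, and real OCR software relies on far more robust methods.

```python
# Toy 3x5 bitmap templates: '#' is ink, '.' is background.
# These shapes are invented for illustration; they are not a real OCR font.
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}


def similarity(glyph, template):
    """Return the fraction of pixels on which two same-sized bitmaps agree."""
    pairs = [
        (g, t)
        for g_row, t_row in zip(glyph, template)
        for g, t in zip(g_row, t_row)
    ]
    return sum(g == t for g, t in pairs) / len(pairs)


def recognize(glyph):
    """Return (character, score) for the best-matching template."""
    scored = ((ch, similarity(glyph, tpl)) for ch, tpl in TEMPLATES.items())
    return max(scored, key=lambda item: item[1])


def read_line(glyphs):
    """Turn a sequence of glyph bitmaps into a plain, searchable string."""
    return "".join(recognize(glyph)[0] for glyph in glyphs)


# A "7" with one stray pixel, standing in for scanner noise.
noisy_seven = ["###", "..#", ".#.", "##.", ".#."]

char, score = recognize(noisy_seven)
text = read_line([TEMPLATES["1"], TEMPLATES["0"], noisy_seven])
```

Once the glyphs have been mapped to characters, the result is an ordinary string, which is what makes the scanned content searchable and editable in the first place.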