Classification is a data mining technique that assigns categories to a collection of data in order to support more accurate predictions and analysis. It is sometimes loosely called a Decision Tree, although strictly speaking a decision tree is one kind of classification method. Classification is one of several methods intended to make the analysis of very large datasets effective.
Why Classification?
Very large databases are becoming the norm in today's world of "big data." Imagine a database holding multiple terabytes of data, where a single terabyte is one trillion bytes.
Facebook alone processes 600 terabytes of new data every single day (as of 2014, the last time it reported these figures). The primary challenge of big data is how to make sense of it.
Sheer volume is not the only problem: big data also tends to be diverse, unstructured, and fast-changing. Consider audio and video data, social media posts, 3D data, or geospatial data. This kind of data is not easily categorized or organized.
To meet this challenge, a range of automatic methods for extracting useful information has been developed, classification among them.
How Classification Works
At the risk of moving too far into tech-speak, let's discuss how classification works. The goal is to create a set of classification rules that will answer a question, make a decision, or predict behavior. To start, a set of training data is developed that contains a certain set of attributes as well as the likely outcome.
The job of the classification algorithm is to discover how that set of attributes leads to its outcome.
Scenario: Suppose a credit card company is trying to determine which prospects should receive a credit card offer.
This might be its set of training data:
Name | Age | Gender | Annual Income | Credit Card Offer |
---|---|---|---|---|
John Doe | 25 | M | $39,500 | No |
Jane Doe | 56 | F | $125,000 | Yes |
The "predictor" columns Age, Gender, and Annual Income determine the value of the "prediction attribute," Credit Card Offer, which is the outcome to be predicted. In a training set, the prediction attribute is known. The classification algorithm then tries to determine how the value of the prediction attribute was reached: what relationships exist between the predictors and the decision? It develops a set of prediction rules, usually expressed as IF/THEN statements, for example:
IF (Age > 18 AND Age < 75) AND Annual Income > 40,000 THEN Credit Card Offer = Yes
Obviously, this is a simple example, and the algorithm would need a far larger data sample than the two records shown here. Further, the prediction rules are likely to be far more complex, including sub-rules that capture attribute details.
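To make the idea of deriving a rule from training data more concrete, here is a minimal sketch that searches for a single cut-off value on one numeric attribute. It is not the method any particular product uses: the helper name `learn_threshold`, the dictionary keys, and the midpoint search are all assumptions made for illustration, and the only data is the two training records from the table above.

```python
# Training records copied from the table above.
TRAINING = [
    {"name": "John Doe", "age": 25, "income": 39_500, "offer": "No"},
    {"name": "Jane Doe", "age": 56, "income": 125_000, "offer": "Yes"},
]


def learn_threshold(records, attribute, label="offer"):
    """Return (threshold, accuracy) for the rule: attribute > threshold -> "Yes".

    Hypothetical helper: it tries the midpoint between each pair of
    neighbouring values and keeps the one that classifies the most
    training records correctly.
    """
    values = sorted({r[attribute] for r in records})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_accuracy = None, -1.0
    for threshold in candidates:
        correct = sum(
            ("Yes" if r[attribute] > threshold else "No") == r[label]
            for r in records
        )
        accuracy = correct / len(records)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


income_threshold, income_accuracy = learn_threshold(TRAINING, "income")
age_threshold, age_accuracy = learn_threshold(TRAINING, "age")
print(f"IF Annual Income > {income_threshold:,.0f} THEN Credit Card Offer = Yes")
print(f"IF Age > {age_threshold} THEN Credit Card Offer = Yes")
```

With only two records, a cut-off on income and a cut-off on age both fit the training data perfectly, so the sketch cannot tell which attribute actually drives the decision. That is exactly why a much larger sample is needed in practice.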
Next, the algorithm is given a "prediction set" of data to analyze, but this set lacks the prediction attribute (or decision):
Name | Age | Gender | Annual Income | Credit Card Offer |
---|---|---|---|---|
Jack Frost | 42 | M | $88,000 | |
Mary Murray | 16 | F | $0 | |
This prediction data helps estimate the accuracy of the prediction rules, and the rules are then adjusted until the developer considers the predictions effective and useful.
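As a rough illustration, the example rule can be written as a small function and applied to the two prediction-set records. This sketch assumes that both age bounds must hold (older than 18 and younger than 75), since a rule that accepted either bound alone would accept every age. The function name `credit_card_offer` and the dictionary keys are hypothetical.

```python
def credit_card_offer(age, annual_income):
    """Apply the example rule; both age bounds are assumed to be required."""
    return "Yes" if 18 < age < 75 and annual_income > 40_000 else "No"


# Records copied from the prediction set table; the decision column is empty there.
PREDICTION_SET = [
    {"name": "Jack Frost", "age": 42, "gender": "M", "income": 88_000},
    {"name": "Mary Murray", "age": 16, "gender": "F", "income": 0},
]

# Fill in the missing decision for each record.
predictions = {
    r["name"]: credit_card_offer(r["age"], r["income"]) for r in PREDICTION_SET
}
for name, offer in predictions.items():
    print(f"{name}: {offer}")
```

Gender is carried along but unused, because the example rule does not mention it. Judging whether these predictions are good still requires comparing them against outcomes that are known from elsewhere.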
Day-to-Day Examples of Classification
Classification, along with other data mining techniques, is behind much of our day-to-day experience as consumers.
Weather forecasts might use classification to report whether the day will be rainy, sunny, or cloudy. The medical profession might analyze health conditions to predict medical outcomes. One type of classification method, Naive Bayes, uses conditional probability to categorize spam emails. From fraud detection to product offers, classification works behind the scenes every day, analyzing data and producing predictions.
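The Naive Bayes idea mentioned above can be sketched in a few lines. This is a toy version only: the four training messages are invented for illustration, "ham" is used here as a label for non-spam, and the function names `train` and `classify` are hypothetical. It scores each class by its prior probability combined with per-word conditional probabilities, using add-one smoothing so that unseen word and class pairs do not zero out the score.

```python
import math
from collections import Counter

# Hypothetical toy training messages, invented purely for illustration.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free offer click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]


def train(examples):
    """Count messages per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = {}
    for text, label in examples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocabulary


def classify(text, class_counts, word_counts, vocabulary):
    """Return the class with the highest log posterior under Naive Bayes."""
    total_messages = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        # log P(class)
        score = math.log(count / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # log P(word | class) with add-one (Laplace) smoothing
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("free prize", *model))
print(classify("monday meeting", *model))
```

Working in log probabilities avoids multiplying many small numbers together, which matters once messages grow beyond a handful of words.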