What Is k-Means Clustering?

Data mining with the k-means algorithm

The k-means clustering algorithm is a data mining and machine learning tool used to group observations into clusters of related observations without any prior knowledge of those relationships. By sampling, the algorithm attempts to show which category, or cluster, each data point belongs to, with the number of clusters defined by the value k.

The k-means algorithm is one of the simplest clustering techniques and is commonly used in medical imaging, biometrics, and related fields. The advantage of k-means clustering is that it tells you about your data (it is an unsupervised method) rather than requiring you to teach the algorithm about the data beforehand (as a supervised method would).

It is sometimes called Lloyd's algorithm, particularly in computer science circles, because the standard algorithm was first proposed by Stuart Lloyd in 1957. The term "k-means" was coined in 1967 by James MacQueen.

How the k-means Algorithm Works

The k-means algorithm is an iterative algorithm that takes its name from the way it operates. It clusters observations into k groups, where k is supplied as an input parameter. It then assigns each observation to a cluster based on the observation's proximity to the mean of that cluster. The cluster means are then recomputed and the process begins again. Here is how the algorithm works:

  1. The algorithm selects k points as the initial cluster centers (the means).
  2. Each point in the dataset is assigned to the closest cluster, based on the Euclidean distance between the point and each cluster center.
  3. Each cluster center is recomputed as the average of the points in that cluster.
  4. Steps 2 and 3 repeat until the clusters converge. Convergence may be defined differently depending on the implementation, but it normally means either that no observations change clusters when steps 2 and 3 are repeated, or that the changes make no material difference to the definition of the clusters.
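The four steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function name `kmeans`, the `max_iter` and `seed` parameters, and the sample data are all assumptions made for the example, and convergence is taken to mean that no point changes cluster.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Returns (centers, labels), where labels[i] is the index of the
    cluster that points[i] was assigned to.
    """
    rng = random.Random(seed)
    # Step 1: pick k of the data points as the initial cluster centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Step 4: stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each center as the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


# Hypothetical sample data: two tight, well-separated groups.
points = [(1.0, 1.0), (1.5, 1.5), (2.0, 2.0), (8.0, 8.0), (8.5, 8.5), (9.0, 9.0)]
centers, labels = kmeans(points, k=2)
print(centers)
print(labels)
```

On this sample data the first three points should end up in one cluster and the last three in the other. Which cluster is numbered 0 depends on the random starting centers, and on less cleanly separated data different starting centers can lead to different final clusters.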

Choosing the Number of Clusters

One of the main disadvantages of k-means clustering is that you must specify the number of clusters as an input to the algorithm. As designed, the algorithm cannot determine the appropriate number of clusters and depends on the user to identify it in advance.

For example, if you had a group of people to be clustered based on binary gender identity as male or female, calling the k-means algorithm with the input k = 3 would force the people into three clusters when only two, that is an input of k = 2, would provide a more natural fit.

Similarly, if a group of individuals were easily clustered by home state and you called the k-means algorithm with the input k = 20, the results might be too generalized to be useful.

For this reason, it is often a good idea to experiment with different values of k to find the one that best suits your data. You may also wish to explore other data mining algorithms in your search for machine-learned knowledge.
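One way to experiment with different values of k is to score each clustering by its within-cluster sum of squared distances and see where the score stops improving sharply, a heuristic often called the elbow method. The sketch below is only illustrative: the helper names `kmeans`, `within_cluster_ss`, and `best_of_restarts`, the number of restarts, and the sample data are assumptions for the example, and the article does not prescribe any particular scoring rule.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Compact k-means on a list of tuples; returns (centers, labels)."""
    centers = random.Random(seed).sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # converged: no point changed cluster
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


def within_cluster_ss(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))


def best_of_restarts(points, k, restarts=10):
    """Run k-means from several random starts and keep the lowest score."""
    return min(
        within_cluster_ss(points, *kmeans(points, k, seed=s))
        for s in range(restarts)
    )


# Hypothetical sample data: two tight, well-separated groups.
points = [(1.0, 1.0), (1.5, 1.5), (2.0, 2.0), (8.0, 8.0), (8.5, 8.5), (9.0, 9.0)]
scores = {k: best_of_restarts(points, k) for k in range(1, 5)}
for k, score in scores.items():
    print(k, round(score, 3))
```

For this data the score drops steeply between k = 1 and k = 2, which matches the two visible groups. Real data rarely shows such a clean break, so the score is best read as one piece of evidence alongside domain knowledge.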