Is there any way of ordering/sorting vectors?

https://datascience.stackexchange.com/questions/41345

k-nn

01-11-2019
|

Question

I am working on KD-Tree for nearest neighbor algorithm, where at each level of the tree we arbitrarily choose a dimension to cut upon and sort the points based on that chosen dimension value, after which we can choose mid-point to split upon such that dividing the points from that middle point will give us a balanced tree (has same number of points in either of the subtrees). At each level a single dimension is chosen and process is repeated.

For example: for data having dimensions (x,y,z) let's say first level of the tree 'X' is chosen at cut dimension, 'Y' at second level, 'Z' at third and then repeated again with 'X' again at fourth level and so on. So for first level all points are sorted according to dimension 'X' and split into left and right subtree based on the middle value of the sorted data.

I am sure you got the gist of it. Otherwise please read more here. KD-Tree (Wiki)

Now what I want to try something where rather than using each dimension at different level, I use all at the same level. i.e. I take points p(x,y,z) as vectors. But as you can guess if I do this, I have no way of ordering (not any I know of) the vectors (points) such that I can find a vector to divide upon hence still keeping my tree somewhat balanced.

PS: I am not much concerned about if it is actually a better or worse way of doing thing. Right now it's more about if it is possible to do it this way and then find out why it is better or worse.

Please ask if I should provide any more information or if I mentioned something wrong. I appreciate your help.

Please add more tags if appropriate, I couldn't find any.

Edit: Just to make it clear. Basically I want to be able to have an ordering which won't destroy the order based on distances. i.e. if a point/vector A has less distance from another point P, it comes first in the order than some farther point, let's say B. My end goal is to use this to try and create some kind of nearest neighbor algorithm.

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange