Centering Map Viewports Near Locations while Preserving Privacy

Last week, I was at a #pdxgeekout session at the Ace Hotel in downtown Portland. @caseorganic, @aaronpk, @brennannovak, and @maxogden were there and Aaron was working on geoloqi and asked everyone how to best randomize the location data used to center the map.

Why try and mask someone’s location? I could set up geoloqi to let people I don’t know leave message that I get only when I pass withing a certain distance from that message. How do they leave a message in an area that I’m likely to get the message? They have to be able to see where I’m likely to find a message. This means the map they see needs to be centered near my location. Since geoloqi is open source, everyone has access to the source. If the interface for people to leave messages is public — which it is, then anyone can query it. If it just returned my location offset by a random amount, given a certain number of queries the centroid of the returned locations would be, quite accurately, my location.

What geoloqi needs is a method of offsetting the current location which statistically cannot be tricked or munged into reproducing one’s current location. This is the major constraint: How does one ensure that a dedicated (stalker/zombie/attacker/real life spamer/salesman) cannot reproduce one’s location, given the algorithm and a set of data requested from the public API for the map center?

Instead of returning a location that is only a random offset from my current location, I want it to take my history into account. I would like it to take all of my history in a certain region — let us call it \mathcal{R} — and assign a certain probability of me being there based on a sampling of the points. How do I go about doing this?

Let’s take a basis function and look at it:

\displaystyle B(\mathbf{x}) = e^{-{\frac{x_{lat}^2+x_{lang}^2}{2\sigma}}}

This basis function is highest at the origin and does an exponential decay as the radius from the origin is increased and \sigma defines how slowly the function decreases as a function of the radius (it is similar to the standard deviation in the normal distribution, but I’m not using that here). Using this function we can simulate the uncertainty of finding someone at a specific location. Here’s what it looks like graphed out:

2-d Gaussian as Radial Basis Function

By taking a sampling of points from the history of places they’ve been, we can create a crude probability map of where someone could likely be given their current location \mathbf{x} . All we need to do is center this function around all of the sampled points. Let us call the sampling of the history points \mathcal{H} . The probability function over that region is:

\displaystyle P_{\mathcal{H}}(\mathbf{x}) = \sum_{\mathbf{p}\in\mathcal{H}}{ e^{ - \left( \frac{(x_{lat}-p_{lat})^2 + (x_{lng}-p_{lng})^2}{2\sigma} \right)}}

Where p_{lat} and p_{lng} are the latitude and longitude of each point \mathbf{p} in the location history \mathcal{H} .

Let’s say someone’s been undulating through town and their path looks something like this:

Sinusoid composed of radial basis functions

As we can see, because of the rate of change in the \sin function the sampled points get closer as the function’s derivative approaches zero. Our next question is, “How do we create a well distributed sampling of points?” We need to look at our basis function and look at how it decays with distance. If we choose a radius where the height is 0.5, the height of the probability function should stay around unity (unlike the graph above). Here’s the equation for that:

\displaystyle r = \sqrt{-2\sigma\ln(\frac{1}{2})}

As long as the points are spaced r apart, the probability function P should remain nearly normalized. If we wanted to make this better behaved, we would weight each basis according to how long it has been since they’ve been there and how many adjacent sample points there are.

Now that we have our main function P , we need to calculate the directional derivative so we can compute tangent lines and and their intersections with the xy plane at z = 0 :

\displaystyle \partial_{\mathbf{u}}P_{\mathcal{H}}(\mathbf{x}) = \sum_{\mathbf{p}\in\mathcal{H}}{ \left( \frac{x_{lat}u_{lat}-p_{lat}u_{lat} + x_{lng}u_{lng}-p_{lng}u_{lng}}{\sigma} \right) e^{ - \left( \frac{(x_{lat}-p_{lat})^2 + (x_{lng}-p_{lng})^2}{2\sigma} \right)}}

Now we can choose a random point \mathbf{p}_r in the neighborhood of \mathbf{p}_{curr} — our current location, a semi-random unit vector \mathbf{u} , and then calculate a tangent line and it’s intersection with the ground plane.

Let \mathbf{u} = (\cos\theta_{\mathbf{u}}, \sin\theta_{\mathbf{u}}) . Let \theta_{\mathbf{u}} be within a range that is nearly perpendicular to a line through two historical points (say, \mathbf{p}_1 and \mathbf{p}_2 ) adjacent to the current point, so that \left|\mathbf{u} \cdot \frac{\mathbf{p}_{1}\mathbf{p}_{2}}{\|\mathbf{p}_{1}\mathbf{p}_{2}\|}\right| \leq \frac{1}{2}

Keep in mind that \mathbf{u} is defined with a frame of reference of our coordinate system, while the above dot product is independent of the frame of reference. If we wanted to calculate the possible angles that \theta_{\mathbf{u}} could take, we need to first calculate the angle between (p_1, p_2) and the \mathbf{x} axis:

\displaystyle \theta_{\mathbf{e}_1} = \arccos \left(\mathbf{e}_{1}\cdot\frac{\mathbf{p}_{1}\mathbf{p}_{2}}{\| \mathbf{p}_{1}\mathbf{p}_{2} \|} \right)

Where \mathbf{e}_1 is the basis vector for the x axis, or (1, 0) .

Then we know that our random angle can be anywhere from \left[ -\frac{\pi}{3}, \frac{\pi}{3}\right] (because \arccos\frac{1}{2} = \frac{\pi}{3} ). Then to find the angle of \mathbf{u} we find \theta_{\mathbf{u}} = random\left[ -\frac{\pi}{3}, \frac{\pi}{3}\right] + \theta_{\mathbf{e}_1} .

Now, we evaluate P_{\mathcal{H}}(\mathbf{p}_{r}) , \partial_{\mathbf{u}}P_{\mathcal{H}}(\mathbf{p}_{r}) , and find \delta_{\mathbf{u}} :

\displaystyle \delta_{\mathbf{u}} = - \frac{P_{\mathcal{H}}(\mathbf{p}_{r})}{\partial_{\mathbf{u}}P_{\mathcal{H}}(\mathbf{p}_{r})}

Then we find the xy point \mathbf{p}_{new} = \mathbf{p}_{r} + \delta_{\mathbf{u}}\mathbf{u} . If P_{\mathcal{H}}(\mathbf{p}_{new}) is below a certain threshold then we can safely return \mathbf{p}_{new} . If not we set \mathbf{p}_{r} = \mathbf{p}_{new} and start the iteration again. Lather. Rinse. Repeat as necessary.

Once this point is computed for one person, it should be cached until their location is updated again. If you want to return random points around this \mathbf{p}_{new} , ensure the region of return points doesn’t intersect the current region within which the person is located.

Since \mathbf{p}_{new} is dependant on \partial_{\mathbf{u}}P_{\mathcal{H}}(\mathbf{p}_r) ,  care should be taken to ensure it does not fall outside of the desired zoom range for the map. The possible range of values for \mathbf{p}_{new} can be calculated, but it is a very messy problem in the general case. It is left as an exercise to the reader. 😉

Profound Similarities

I was messing around with some ideas the other day and noticed some similarities between some very beautiful identities.

\displaystyle e^{\theta\imath} = \cos\theta + \imath\sin\theta, \displaystyle \varphi^{n} = \frac{L_{n} + F_{n}\sqrt{5}}{2}

Here are identities for each of the functions above:

\displaystyle \cos\theta = \frac{e^{\theta\imath} + e^{-\theta\imath}}{2}, \displaystyle \sin\theta = \frac{e^{\theta\imath} - e^{-\theta\imath}}{2i}
\displaystyle L_n = \varphi^{n} + (1-\varphi)^{n}, \displaystyle F_n = \frac{\varphi^{n} - (1-\varphi)^{n}}{\sqrt{5}}

Cosine is an even function, and the nth Lucas number is also given by an even function. Sine is and odd function, and the nth Fibonacci number is given by one also. But what should be remarkable is the similarity of the definitions.

Here’s something else that is really awesome. One should notice e^{\theta\imath} = \cos\theta + \imath\sin\theta and e^{-\theta\imath} = \cos\theta - \imath\sin\theta are conjugates. One should also notice that \varphi = \frac{1 + \sqrt{5}}{2} and 1 - \varphi = \frac{2}{2} - \frac{1+\sqrt{5}}{2} = \frac{1-\sqrt{5}}{2} are conjugates as well.

Now, I can define an operator and it’s conjugate or dual that can be used to simply write the relationship between all the above functions and more. I’ll use the symbol \heartsuit, because I don’t know what it’s real use is and I <3 this relation.

\displaystyle \heartsuit F = F + \overline{F}, \displaystyle \heartsuit^{\ast} F = F - \overline{F},

where \overline{F} represents the conjugate of a function F.

Now it’s possible to write:

\displaystyle \heartsuit\varphi^{n} = L_n, \displaystyle \frac{1}{\sqrt{5}}\heartsuit^{\ast}\varphi^{n} = F_n
\displaystyle \frac{1}{2}\heartsuit e^{\theta\imath} = \cos\theta, \displaystyle \frac{1}{2\imath}\heartsuit^{\ast}e^{\theta\imath} = \sin\theta

Now if I let \overline{e^{\theta\imath}} = e^{-\theta\imath} and then generalize (perhaps wrongly, but wait and see what happens!) \overline{e^{\theta}}=e^{-\theta}, then I can write:

\displaystyle \frac{1}{2}\heartsuit e^{\theta} = \cosh\theta, \displaystyle \frac{1}{2}\heartsuit^{\ast}e^{\theta} = \sinh\theta

This is in perfect accordance with the (lesser?) known identities \cos(\theta\imath) = \cosh\theta and \sin(\theta\imath) = - \sinh\theta and the identities:

\displaystyle \cosh\theta = \frac{e^{\theta} + e^{-\theta}}{2}, \displaystyle \sinh\theta = \frac{e^{\theta} - e^{-\theta}}{2}

\blacksquare

Just so you can stare at them all together, here they are:

\displaystyle \heartsuit\varphi^{n} = L_n, \displaystyle \frac{1}{\sqrt{5}}\heartsuit^{\ast}\varphi^{n} = F_n
\displaystyle \frac{1}{2}\heartsuit e^{\theta\imath} = \cos\theta, \displaystyle \frac{1}{2\imath}\heartsuit^{\ast}e^{\theta\imath} = \sin\theta
\displaystyle \frac{1}{2}\heartsuit e^{\theta} = \cosh\theta, \displaystyle \frac{1}{2}\heartsuit^{\ast}e^{\theta} = \sinh\theta

Interesting Fibonacci Goodness

Last year I was working on deriving a matrix based method for translating and rotating various polygons around pentagons. In the process of this I got sidetracked and started to look into the golden mean. People swoon over how it defines the most beautiful rectangles! It crops up inside pentagons and various 3-D polyhedra. It’s glorious. It’s just like how a circle has the ratio pi, and it has a name too: phi. I’ll be writing it down in Greek from here on out (it’s originally from there anyway). So learn this shape: \varphi Sometimes people write it out like this: \phi Either way it’s just a variable name. Sometimes it refers to something other than the golden mean, but here it’s the golden mean. As I found powers of \varphi , I kept noticing certain values popping up in sequence. It turns out that one can find powers of \varphi through the following formula:

\displaystyle\frac{\sqrt{5}F_n + L_n}{2} = \varphi^n

Where F_n and L_n are the nth Fibonacci and Lucas numbers, respsectively.

What are these numbers? Well, you probably know about the Fibonacci numbers, but I’ll go over them here. They follow a certain sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144… and so on. Typically, one starts counting with one, but they really start with zero. Start with a zero and a one. The next number is 0 + 1 = 1. The next number is 1 + 1 = 2. After that, 1 + 2 = 3. One keeps adding the current number to the previous number to get the next number. These numbers crop up in pine cones, sunflowers, and a myriad of other places.

The other not so well known sequence, the Lucas numbers, are defined the same exact way, but with different starting numbers: 2, 1, 3, 4, 7, 11, 18, 29, 47, 76, 123, 199, 322, 521… forever and ever, amen. Start with 2 and 1. 2 + 1 = 3. 1 + 3 = 4. 3 + 4 = 7. If you’ve read Godel, Escher, Bach: An Eternal Golden Braid you may have run across this sequence. They grow the same way the Fibonacci numbers do. If one takes the ratio of the Lucas numbers to the Fibonacci numbers as they grow, one will find it approximates the square root of five.

I wasn’t certain that I was writing things out properly, so I started writing out everything again. Then, I decided to take my knowledge of the binomial theorem and apply it to this question. The binomial theorem is really awesome. It’s related to Pascal’s triangle. If you think that’s cool you’re in luck. If you haven’t hadn’t heard of it yet. Fearest not, I shall walk thee through it. Pascal’s triangle looks like this:

\displaystyle 1

\displaystyle 1 \qquad 1

\displaystyle 1 \qquad 2 \qquad 1

\displaystyle 1 \qquad 3 \qquad 3 \qquad 1

\displaystyle 1 \qquad 4 \qquad 6 \qquad 4 \qquad 1

If you add two adjacent numbers in a row together you get the number that goes in between the two numbers on the next row down. It can be expressed in terms of a nifty little operation known as the factorial. The factorial takes a number and multiplies that number by every counting number less than it. So, if I take the factorial of 3, written 3!, I get 1 x 1 x 2 x 3 = 6. If I want to find the kth number of the nth row, of Pascal’s triangle I use the choose function. It is written out like this:

\displaystyle{n \choose k} = \frac{n!}{k!(n-k)!}

This tells me how many ways I can choose k items out of n total items. It also tells me what the kth coefficient is for a binomial raised to the nth power. If I have the binomial x +  y and I square it, I end up with x^2 + 2xy + y^2 . Let me make something more obvious, (1)x^2 + (2)xy + (1)y^2 . Notice a pattern yet? Try this one, (1)x^3 + (3)x^2y + (3)xy^2 + (1)y^3 . See the rows in the triangle appearing? Let me arrange those starting with raising the binomial the the power of 0:

\displaystyle (1)

\displaystyle (1)x + (1) y

\displaystyle (1)x^2 + (2)xy + (1)y^2

\displaystyle (1)x^3 + (3)x^2y + (3)xy^2 + (1)y^3

\displaystyle (1)x^4 + (4)x^3y +(6)x^2y^2 + (4)xy^3 + (1)y^4

Okay, okay, I’ll stop. It should be entirely too obvious by now. Hold onto this because I’ll come back to it in a moment.

First let’s talk about a number I mentioned before. The golden mean is defined mathematically as the ratio:

\displaystyle \varphi = \frac{1+\sqrt{5}}{2}

This, my friends, is a binomial. It’s a really special binomial, it is easy to work with and has some cool properties when the binomial theorem is applied to it. Now, let me introduce you to another formula called the Binet formula. It gives one the nth number of the Fibonacci sequence.

\displaystyle F_n = \frac{\varphi^n - (1-\varphi)^n}{\sqrt{5}}

It’s a nice little gem of a formula, but let’s write it out in terms of the binomials it contains:

\displaystyle F_n = \frac{(1+\sqrt{5})^n-(1-\sqrt{5})^n}{2^n\sqrt{5}}

It may look more frightening, but trust me, it’s simpler this way. There is some “magic” we can do with the binomials on the top. Let me show you, but first I’ll need to introduce you to the summation sign. He’s a simple fellow who might look scary. All he does is add things together. Lets watch him add all the numbers from 1 to 5.

\displaystyle 1 + 2+ 3 +4 + 5 = \sum_{i=1}^5{i}

That’s all there is to it, just start with the number 1 in the variable i, evaluate the expression, increment i, evaluate the expression, add it to the previous result, rinse, repeat as needed.

Now, I’ll continue. The term (1 + \sqrt{5})^n can be written as a series sum \sum_{i=0}^n{{n \choose i}\sqrt{5}^i} and the term (1 - \sqrt{5})^n can be written as \sum_{i=0}^n{{n \choose i}(-\sqrt{5})^i} . Put these together and several things happen because of the minus sign. It actually sorts out the odd powered terms, getting rid of the even terms (because the even terms are subtracted from each other, while the odd terms are added to each other). That leaves:

\displaystyle F_n = \frac{1}{2^n\sqrt{5}}\sum_{i \nmid 2}^n{{n \choose i}2\sqrt{5}^i} ,

which reduces to,

\displaystyle F_n = \frac{1}{2^{n-1}}\sum_{i \nmid 2}^n {{n \choose i}5^{\frac{i-1}{2}}}

,
and the notation \displaystyle i\nmid2 means all i not divisible by 2 or all i mod 2 = 1. It’s the notation used by Ireland and Rosen, not to be confused with other notation meaning the first number divides the second number.

Now you look at me and ask, “How did that get any simpler”? It didn’t really, but this has the property of proving that all numbers in the sequence will be integers, otherwise known as nice round whole numbers! How? Notice that we are summing over all odd numbers less than n starting with 1, if one subtracts 1 from an odd number one gets an even number back. An even number is divisible by two, so that fraction in 5’s exponent reduces to an integer.

Oh, I guess I still need to prove that 2 raised to n-1 evenly divides the sum. In the case of n = 0, the sum equals zero and 0 times 2 = 0. For n = 1, we have 1 times 1 = 1. Now, for every other case there is a 1 plus an odd number multiplied by a coefficient that grows faster than 2 raised to n-1. I give up for now. 🙂 I just know that there are enough twos being multiplied and added together that 2 raised to n-1 divides out evenly.

The corresponding formulas for the Lucas series are:

\displaystyle L_n = \frac{(1+\sqrt{5})^n+(1-\sqrt{5})^n}{2^n}

and

\displaystyle L_n = \frac{1}{2^{n-1}}\sum_{i \mid 2}^n {{n \choose i}5^{\frac{i}{2}}}

where the summation is over all even integers less than n divisible by two starting with zero, or where the condition i mod 2 = 0 holds.

The next exploration I have in mind for this is to derive something called a q-analog of these series. Also, (and I’m stoked about this!) I want to look at what the Fibonacci and Lucas sequences look like when extended into the complex plane.

Updated code:

Here is some Ruby code for calculating the Fibonacci and Lucas numbers using my derived methods. I took advantage of the functional aspects of Ruby to simply writing out the computation and sorting the odd and even terms before applying the rest of the computations. Another optimization I made was to create a factorial lookup table. Instead of computing the factorial every time, it calculates all factorials in an array and looks up the number (resulting in quite a speed increase, but the ). Also, note that this version uses my fully simplified identities and will obviously only return integers. The past version suffered from rounding errors, and the result had to be cast back into an integer. Just copy and past the following into fibonacci.rb, and type ruby fibonacci.rb 100 to get the first one hundred numbers in each sequence.


base = 5
num = ARGV[0].to_i

def is_even(n)
  n % 2 == 0
end

def is_odd(n)
  n % 2 == 1
end

def sum (set)
  set.inject(0) { |total, i| total + i }
end

def setup_factorial(n)
  f = Array(0..n)
  f.map! { |i| if i == 0 then 1 else i*f[i-1] end }
end

def choose(n, k, f)
  f[n] / ( f[k] * f[n-k] )
end

def lucas(base, range, f)
  lucas = Array(range)
  range.each do |n|
    lucas[n] = Array(0..n)
    #compute lucas terms
    lucas[n] = lucas[n].select { |i| is_even(i) }
    lucas[n].map! { |i| choose(n, i, f) * base **(i/2) }
    lucas[n] = sum(lucas[n]) / (2**(n-1))
  end
  lucas
end

def fibonacci (base, range, f)
  fibonacci = Array(range)
  range.each do |n|
    fibonacci[n] = Array(0..n)
    #compute fibonacci terms
    fibonacci[n] = fibonacci[n].select { |i| is_odd(i) }
    fibonacci[n].map! { |i| choose(n, i, f) * base **((i-1)/2) }
    fibonacci[n] = sum(fibonacci[n]) / (2**(n-1))
  end
  fibonacci
end

puts "Setting up factorial lookup table..."
factorial = setup_factorial(num)
puts "Done."
puts "Calculating series..."
fibonacci(base, 0..num, factorial).each { |i| print i, ", " }
puts

Old Code

Here is some ruby code for calculating the Fibonacci and Lucas numbers using my derived methods. Please note that it inefficiently sums over both even and odd numbers in both series, but multiplies the unneeded results with a zero. It also cheats the rounding errors by casting the floats back into an integer (which might actually cause some numbers to be wrong).


def is_odd(n)
  n%2
end

def is_even(n)
  (n+1)%2
end

def sum (range)
  range.inject(0) { |total, i| total + yield(i) }
end

def factorial(n)
  if n == 0
    1
  else
    n * factorial(n-1)
  end
end

def choose(n, k)
  factorial(n) / ( factorial(k) * factorial(n-k) )
end

def fibonacci(n)
  ( sum(0..n) { |j| is_odd(j) * choose(n, j) * (Math.sqrt(5)**(j-1)) / 2**(n - 1) } ).to_i
end

def lucas(n)
  ( sum(0..n) { |j| is_even(j) * choose(n, j) * (Math.sqrt(5)**j) / 2**(n - 1) } ).to_i
end

(0..100).each { |j| print fibonacci(j), ", " }
puts
(0..100).each { |j| print lucas(j), ", " }
puts

Today is really Square Day!

——– Original Message ——–
Subject: Happy Square Day!
Date: Thu, 05 Mar 2009 17:24:39 -0600
From: Roger Hill
Reply-To: Handheld Computing Conference discussion list
Organization: Southern Illinois University at Edwardsville
To: hhc@lists.brouhaha.com, ****@deepthought.com

Hi all,

Recently there was something in the news about “Square Root Day”,
03/03/09, and Richard Nelson circulated an E-mail about it. I was
corresponding with a friend about this, and he suggested asking what
you would get if you just ran the day, month, and 4-digit year
together as MMDDYYYY and looked for perfect squares. For example,
today in American-style notation is 3/05/2009 which runs together as
the number 3052009.

So I checked on that, and was rather startled to discover that the
nearest “square day” is actually TODAY! Happy Square Day!!

Naturally I had to turn a computer loose on the problem, so using
Excel I found that today is only the 6th square day since 1900 (which
is as far back as Excel deals with dates). The last one was
9/01/2004. But the next one is only a few weeks away, 4/01/2009 (no
fooling!). After that, you’ll have to wait till 2/26/2016.

If (like me) you favor the YYYY/MM/DD notation, today is not a square
day but the next one will be 2015/11/21. I have not yet tried to
figure it out for the European-style (DDMMYYYY) format.

— Roger

(P.S. I’m sending this to both the HHC and the CHIP lists; sorry if
you get it twice as a result.)

_______________________________________________
HHC mailing list
HHC@lists.brouhaha.com
http://lists.brouhaha.com/mailman/listinfo/hhc