Indiana native. Purdue grad. Programmer / Dev Ops in trade. Dog owner. Husband and father. Have questions? Ask! 96 stories · 1 follower

## #1541 – Solved (No Comments) by Chris Sunday June 18th, 2017 at 11:42 AM

1 Share

3 days ago
Central Indiana

## Guy makes Monkey Island grog, anticlimactically doesn't die Saturday May 20th, 2017 at 9:30 PM

1 Comment

Monkey Island fanatic and Eurogamer personality Johnny Chiodini decided to answer a gauntlet that was never thrown by creating grog using the ingredients listed in The Secret of Monkey Island. And although he punts on third down by subtracting the more lethal ingredients, the end result is still gross enough that we award him and his two confederates eight Monkey Bucks for introducing it into their digestive system.

Here is the write-up, and below is the video proof of the reckless endeavor:

32 days ago
This explains a lot about the pirates in that game.
Central Indiana

## Trajectory recovery from Ash: User privacy is NOT preserved in aggregated mobility data by adriancolyer Monday May 15th, 2017 at 6:55 AM

1 Comment and 2 Shares

Borrowing a little from Simon Wardley’s marvellous Enterprise IT Adoption Cycle, here’s roughly how my understanding progressed as I read through this paper:

Huh? What? How? Nooooo, Oh No, Oh s*@\#!

Xu et al. show us that even in a dataset in which you might initially think there is no chance of leaking information about individuals, they can recover data about individual users with between 73% and 91% accuracy. Even in datasets which aggregate data on tens of thousands to hundreds of thousands of users! Their particular context is mobile location data, but underpinning the discovery mechanism is a reliance on two key characteristics:

1. Individuals tend to do the same things over and over (regularity) – i.e., there are patterns in the data relating to given individuals, and
2. These patterns are different across different users (uniqueness).

Therefore, statistical data with similar features is likely to suffer from the same privacy breach. Unfortunately, these two features are quite common in the traces left by humans, which have been reported in numerous scenarios, such as credit card records, mobile application usage, and even web browsing. Hence, such a privacy problem is potentially severe and universal, which calls for immediate attention from both academia and industry.

(Emphasis mine). As we’ll see, there are a few other details that matter which in my mind might make it harder to transfer to other problem domains – chiefly the ability to define appropriate cost functions – but even if it does relate only to location-based data, it’s still a very big deal.

Let’s take a look at what’s in a typical aggregated mobility dataset, and then go on to see how Xu et al. managed to blow it wide open.

### Aggregated mobility data (ash)

In an attempt to preserve user privacy, owners of mobility data tend to publish only aggregated data, for example, the number of users covered by a cellular tower at a particular timestamp. The statistical data is of broad interest with applications in epidemic controlling, transportation scheduling, business intelligence, and more. The aggregated data is sometimes called the ash of the original trajectories.

The operators believe that such aggregation will preserve users’ privacy while providing useful statistical information for academic research and commercial usage.

To publish aggregated mobility data, first the original mobility records of individual mobile users are grouped by time slots (windows), then for each window, aggregate statistics are computed (e.g., the number of mobile users covered by each base station).
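The two-step process above can be sketched in a few lines of Python. This is only an illustration under assumptions of mine: the record layout `(user_id, unix_timestamp, location_id)`, the function name `aggregate`, and the choice to count each user at most once per location and slot are all hypothetical, not taken from the paper.

```python
from collections import Counter, defaultdict


def aggregate(records, window_seconds):
    """Count distinct users per (time slot, location).

    records: iterable of (user_id, unix_timestamp, location_id) tuples
             (a hypothetical layout chosen for this sketch).
    Returns {slot_index: Counter({location_id: user_count})}.
    """
    seen = defaultdict(set)  # slot index -> set of (user, location) pairs
    for user, timestamp, location in records:
        seen[timestamp // window_seconds].add((user, location))

    counts = {}
    for slot, pairs in seen.items():
        # Only the per-location counts survive; user ids are dropped here.
        counts[slot] = Counter(location for _user, location in pairs)
    return counts


records = [
    ("alice", 10, "tower_A"),
    ("alice", 20, "tower_A"),  # same user, same slot: counted once
    ("bob", 30, "tower_B"),
    ("alice", 3700, "tower_B"),
    ("bob", 3800, "tower_B"),
]
ash = aggregate(records, window_seconds=3600)
```

With one-hour windows, `ash[0]` holds one user at each tower and `ash[1]` holds two users at `tower_B`; nothing in the output names Alice or Bob.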

Two desirable properties are assumed to hold from following such a process:

1. No individual information can be directly acquired from the datasets, with the aggregated mobility data complying with the k-anonymity privacy model.
2. The aggregate statistics are accurate.

Although the privacy leakage in publishing anonymized individual’s mobility records has been recognized and extensively studied, the privacy issue in releasing aggregated mobility data remains unknown.

Looking at this from a differential privacy perspective, you might already be wondering about the robustness of being able to detect e.g., whether or not a particular individual is represented in the dataset. But what we’re about to do goes much further. Take a dataset collected over some time period, which covers a total of M locations (places). In each time slot, we have the number of users in each of those places (i.e., an M-dimensional vector). That’s all you get. From this information only, it’s possible to recover between 73% – 91% of all the individual users’ trajectories – i.e., which locations they were in at which times, and hence how they moved between them.

Huh? What? How?

### Revealing individual users from aggregate data

The first thing it’s easy to work out is how many individual users there are at any point in time: just sum up all of the user counts for each place at that time. Call that N.

Now we need to consider what other clues might be available in the data. Individual users tend to have fairly coherent mobility trajectories (what they do today, they’re likely to do again tomorrow), and these trajectories are different across different users. See for example the trajectories of five randomly selected users over two days in the figure below. Each user’s pattern is unique, with strong similarity across both days.

The key to recovering individual trajectories is to exploit these twin characteristics of regularity for individual users, and uniqueness across users. Even so, it seems a stretch when all we’ve got is user counts by time and place!

We’re going to build up estimates for the trajectories of each of the N users, one time step at a time. Let the estimate for the trajectory of the $i$th individual user at time step t, $s^{t}_{i}$, be represented by a sequence of locations $[q^{1}_{i}, q^{2}_{i}, q^{3}_{i}, \ldots, q^{t}_{i}]$. For example, here’s a world with M=9 locations:

And here’s a trajectory for a user to time t, represented as a sequence of t locations:

We want to decide on the most likely location for each user at time step t+1, subject to the overall constraint that we know how many users in total there must be at each location at time t+1. The description of how this next part works is very terse in the paper itself, but with a little bit of reverse engineering on my part I believe I’ve reached an understanding. The secret is to formulate the problem as one of completing a decision matrix. This decision matrix takes a special form: it has N rows (one for each user trajectory to date), and N columns. Say we know that there must be two users in location 1, and four users in location 2. Then the first two columns in the matrix will represent the two ‘slots’ available in location 1, and the next four columns will represent the four slots available in location 2, and so on.

In the decision matrix $X^t$, $x^{t}_{i,j} = 1$ if the next location for trajectory $i$ is the location identified by column $j$, and zero otherwise. A valid completion of the matrix has every row adding up to one (each trajectory is assigned one and only one next location), and every column adding up to one (each slot is filled by one and only one user).

Thus the example decision matrix above indicates that the next location for the trajectory of user #3 has been assigned as location 5.
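To make the slot idea concrete, here is a small Python sketch of my own (the helper names `slot_columns` and `is_valid_decision_matrix` are hypothetical). It expands per-location counts into one column per slot, checks the row and column constraints, and reads the next locations back off a completed matrix.

```python
def slot_columns(counts):
    """Expand per-location counts into one column label per available slot."""
    columns = []
    for location in sorted(counts):
        columns.extend([location] * counts[location])
    return columns


def is_valid_decision_matrix(X):
    """True if X is square, 0/1-valued, and every row and column sums to one."""
    n = len(X)
    rows_ok = all(
        len(row) == n and all(v in (0, 1) for v in row) and sum(row) == 1
        for row in X
    )
    if not rows_ok:
        return False
    return all(sum(X[i][j] for i in range(n)) == 1 for j in range(n))


# Two users must end up in location 1 and one user in location 2.
counts_next = {1: 2, 2: 1}
columns = slot_columns(counts_next)  # one entry per slot
N = len(columns)                     # total users = sum of the counts

X = [
    [0, 1, 0],  # trajectory 0 takes the second slot of location 1
    [0, 0, 1],  # trajectory 1 takes the only slot of location 2
    [1, 0, 0],  # trajectory 2 takes the first slot of location 1
]
next_locations = [columns[row.index(1)] for row in X]
```

Note that which of the two slots in location 1 a trajectory lands in makes no difference to the recovered location; the slots only exist so that the matrix is square.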

When we complete the decision matrix, we don’t just make random assignments of slots of course! That’s where the cost matrix $C^t$ comes in. The cost matrix is also an NxN matrix, with the same row and column structure as the decision matrix. Instead of being filled with 1’s and 0’s though, $c^{t}_{i,j}$ contains a value representing the cost of moving to the location for slot $j$ given the trajectory so far for user $i$. Take for example the trajectory for a user in the illustration below, which currently finishes in location 6. We might use as the cost of moving to each potential next location simply the number of hops in the grid to get there (red numbers). The actual cost functions used are more complex than this; this example is just to help you get the idea.

Here then is what a subsection of the cost matrix for the user in the above sketch might look like:

We’ll return to how to define the cost functions in a moment. For now note that the problem has now become the following:
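Written out from the definitions of $x^{t}_{i,j}$ and $c^{t}_{i,j}$ given earlier (my reconstruction under those definitions, not a quotation from the paper), the optimisation reads roughly:

$$
\min_{X^t} \; \sum_{i=1}^{N} \sum_{j=1}^{N} c^{t}_{i,j}\, x^{t}_{i,j}
\quad \text{subject to} \quad
\sum_{j=1}^{N} x^{t}_{i,j} = 1 \;\; \forall i, \qquad
\sum_{i=1}^{N} x^{t}_{i,j} = 1 \;\; \forall j, \qquad
x^{t}_{i,j} \in \{0, 1\}.
$$

The two sum constraints are exactly the "every row adds up to one, every column adds up to one" conditions on a valid decision matrix.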

The above formulated problem is equivalent to the Linear Sum Assignment Problem, which has been extensively studied and can be solved in polynomial time with the Hungarian algorithm.

Space prevents me from explaining the Hungarian algorithm, but the Wikipedia link above does a pretty good job of it (it’s actually pretty straightforward, check out the section on the ‘Matrix interpretation’ to see how it maps in our case).

Thus far then, we’ve discovered a way to represent the problem such that we can recover each step of individual trajectories in polynomial time, so long as we can define suitable cost matrices.
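As a toy illustration of one recovery step, the sketch below solves a 3x3 assignment by exhaustive search. To be clear about assumptions: this brute-force search is a stand-in of mine for the Hungarian algorithm (it tries all N! permutations, so it is only usable for tiny N), and the cost values are made up. For real sizes you would reach for an existing solver such as `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations


def solve_assignment(cost):
    """Return (best_total, assignment), where assignment[i] is the column for row i.

    Exhaustive search over all permutations: illustration only, not the
    Hungarian algorithm, and impractical beyond a handful of rows.
    """
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Slots at time t+1: two users in location "A", one in location "B".
columns = ["A", "A", "B"]
cost = [
    [0, 0, 2],  # trajectory 0 is currently at A
    [2, 2, 0],  # trajectory 1 is currently at B
    [1, 1, 1],  # trajectory 2 is equally far from both
]
total, assignment = solve_assignment(cost)
next_locations = [columns[j] for j in assignment]
```

Here the cheapest completion keeps trajectory 0 in A and trajectory 1 in B, and the leftover slot in A goes to trajectory 2.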

Noooooo.

### Building cost matrices – it’s night and day

At night time, people tend not to move around very much, as illustrated by these plots from two different datasets (the ‘operator’ dataset and the ‘app’ dataset) used in the evaluation:

Not only that, but the night time location of individual users tends to be one of their top most visited locations (often the top location):

For the night time then, it makes sense to use the distance between the location of the user trajectory at time t and the location being considered for time t+1 as the cost.
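A minimal sketch of such a night-time cost matrix, assuming locations are given as planar `(x, y)` coordinates and that plain Euclidean distance is good enough (both are simplifications of mine; the function name is hypothetical):

```python
import math


def night_cost_matrix(current_positions, slot_positions):
    """cost[i][j] = distance from trajectory i's last position to slot j's location."""
    return [
        [math.dist(p, q) for q in slot_positions]
        for p in current_positions
    ]


current_positions = [(0, 0), (3, 4)]  # where each trajectory ends at time t
slot_positions = [(0, 0), (3, 0)]     # location of each slot at time t+1
cost = night_cost_matrix(current_positions, slot_positions)
```

Staying put costs nothing, so at night the cheapest assignment tends to leave every trajectory where it already is.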

In the daytime, people tend to move about.

The key insight is the continuity of human mobility, which enables the estimation of next location using the current location and velocity.

Let the estimated next location using this process be $l$; then we can use as the cost function the distance between $l$ and the location being considered for time t+1.
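Here is a hedged sketch of the daytime cost. The exact velocity estimate used in the paper is not described here, so I assume the simplest option: constant velocity taken from the last two points of the trajectory. Planar `(x, y)` coordinates and the function names are likewise assumptions of mine.

```python
import math


def predict_next(trajectory):
    """Extrapolate linearly: last position plus the last observed displacement."""
    if len(trajectory) < 2:
        return trajectory[-1]  # no velocity available yet: assume no movement
    (x0, y0), (x1, y1) = trajectory[-2], trajectory[-1]
    return (2 * x1 - x0, 2 * y1 - y0)


def day_cost_row(trajectory, slot_positions):
    """One row of the cost matrix: distance from the predicted point l to each slot."""
    l = predict_next(trajectory)
    return [math.dist(l, q) for q in slot_positions]


trajectory = [(0, 0), (1, 0)]      # moving one unit east per time step
slot_positions = [(2, 0), (0, 0)]  # candidate locations at time t+1
predicted = predict_next(trajectory)
row = day_cost_row(trajectory, slot_positions)
```

The slot that continues the eastward movement is free, while doubling back to the start costs two units.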

It is worth noting that the Hungarian algorithm is currently the most efficient algorithm to solve [the linear sum assignment problem], but still has computational complexity of $O(n^3)$. To speed things up, we adopt a suboptimal solution to reduce the dimension of the cost matrix by taking out the pairs of trajectories and location points with cost below a predefined threshold and directly linking them together.
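The speed-up described in that passage might look something like the following. How conflicts are resolved when two trajectories both have a cheap route to the same slot is not stated, so the cheapest-first greedy rule here is my assumption, as are the function name and the example costs.

```python
def prelink_cheap_pairs(cost, threshold):
    """Directly link (row, column) pairs whose cost is below the threshold.

    Pairs are taken cheapest first; a row or column is used at most once.
    Returns (linked, remaining_rows, remaining_cols); only the remaining
    rows and columns need to go to the full assignment solver.
    """
    n = len(cost)
    cheap = sorted(
        (cost[i][j], i, j)
        for i in range(n)
        for j in range(n)
        if cost[i][j] < threshold
    )
    linked, used_rows, used_cols = {}, set(), set()
    for _c, i, j in cheap:
        if i not in used_rows and j not in used_cols:
            linked[i] = j
            used_rows.add(i)
            used_cols.add(j)
    remaining_rows = [i for i in range(n) if i not in used_rows]
    remaining_cols = [j for j in range(n) if j not in used_cols]
    return linked, remaining_rows, remaining_cols


cost = [
    [0.1, 5.0, 6.0],
    [4.0, 0.2, 7.0],
    [3.0, 2.0, 1.0],
]
linked, rows_left, cols_left = prelink_cheap_pairs(cost, threshold=0.5)
```

In this example two of the three trajectories are linked immediately, leaving a 1x1 problem for the solver. The result is no longer guaranteed to be optimal, which is the trade-off the authors accept.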

### Linking trajectories across days

Using the night and day approaches, we can recover mobile users’ sub-trajectories for each day. Now we need to link sub-trajectories together across days. Here we exploit the regularity exhibited in the day-after-day movements of individual users.

Specifically, we use the information gain of connecting two sub-trajectories to measure their similarities.

The entropy in a trajectory is modelled based on the frequency of visiting different locations, and the information gain from linking two sub-trajectories is modelled as the difference between the entropy of the combined trajectory, and the sum of the entropies of the individual trajectories over two. For the same user, we should see relatively little information gain if the two trajectories are similar, whereas for different users we should see much larger information gain. And indeed we do:
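To make the measure concrete, here is a minimal sketch assuming Shannon entropy over location-visit frequencies and trajectories represented as plain lists of location labels (the function names and the example days are mine):

```python
import math
from collections import Counter


def entropy(trajectory):
    """Shannon entropy (in bits) of how often each location is visited."""
    counts = Counter(trajectory)
    total = len(trajectory)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(a, b):
    """Entropy of the combined trajectory minus the mean entropy of the two parts."""
    return entropy(a + b) - (entropy(a) + entropy(b)) / 2


day1 = ["home", "work", "work", "home"]
day2_same_user = ["home", "work", "home", "work"]
day2_other_user = ["gym", "cafe", "gym", "cafe"]

gain_same = information_gain(day1, day2_same_user)
gain_other = information_gain(day1, day2_other_user)
```

Joining two days that visit the same places in the same proportions adds no entropy, so the gain is zero; joining days from users with disjoint haunts doubles the number of equally likely locations and yields a gain of one bit. Linking each sub-trajectory to the candidate with the lowest gain is then another assignment problem.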

To conclude, we design an unsupervised attack framework that utilizes the universal characteristics of human mobility to recover individuals’ trajectories in aggregated mobility datasets. Since the proposed framework does not require any prior information of the target datasets, it can be easily applied on other aggregated mobility datasets.

Once the individual trajectories are separated out, it has been shown to be comparatively easy, with the help of small amounts of external data such as credit card records, to re-identify individual users (associate them with trajectories).

Oh no.

### Evaluation

The authors evaluate the technique on two real world datasets. The ‘app’ dataset contains data for 15,000 users collected by a mobile app which records a mobile user’s location when activated, over a two-week period. The ‘operator’ dataset contains data for 100,000 mobile users from a major mobile network operator, over a one week period. Tests are run on aggregate data produced from these datasets, and the recovered trajectories are compared against ground truth.

In the figures that follow, stage #1 represents night time trajectory recovery, stage #2 day time trajectory recovery, and stage #3 the linking of sub-trajectories across days. Here we can see the recovery accuracy for the two datasets:

For the app dataset, 98% of night time trajectories are correctly recovered, falling to 91% accuracy by the final step. The corresponding figures for the operator dataset are 95% and 73%.

For the recovered trajectories, the following chart shows the percentage that can be uniquely identified given just the top-k locations for k=1 to 5.

From the results, we can observe that given the two most frequent locations of the recovered trajectories, over 95% of them can be uniquely distinguished. Therefore, the results indicate that the recovered trajectories are very unique and vulnerable to be reidentified with little external information.
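A uniqueness check of this kind can be sketched as below. The exact definition used in the paper is not spelled out here, so I assume a trajectory counts as uniquely identified when no other trajectory shares its ranked list of k most visited locations; the tie-break by label and the toy data are also mine.

```python
from collections import Counter


def top_k(trajectory, k):
    """The k most visited locations, most frequent first (ties broken by label)."""
    counts = Counter(trajectory)
    ranked = sorted(counts, key=lambda loc: (-counts[loc], loc))
    return tuple(ranked[:k])


def fraction_unique(trajectories, k):
    """Share of trajectories whose top-k signature is shared with no other trajectory."""
    signatures = Counter(top_k(t, k) for t in trajectories)
    unique = sum(1 for t in trajectories if signatures[top_k(t, k)] == 1)
    return unique / len(trajectories)


trajectories = [
    ["h1", "h1", "h1", "w1", "w1"],  # lives at h1, works at w1
    ["h1", "h1", "h1", "w2", "w2"],  # same home, different workplace
    ["h2", "h2", "h2", "w1", "w1"],  # different home, same workplace
]
unique_k1 = fraction_unique(trajectories, 1)
unique_k2 = fraction_unique(trajectories, 2)
```

In this toy set, the home location alone singles out only one of the three trajectories, but home plus workplace separates all of them.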

To put that more plainly, given the aggregated dataset, and knowledge of your home and work locations, there’s a very good chance I can recover your full movements!!!

Oh s*@\#!

Decreasing the spatial resolution (i.e., using more coarse-grained locations) actually increases the chances of successful trajectory recovery (but only to the location granularity of course). It’s harder to link these recovered trajectories to individual people though as human mobility becomes less unique in coarser-grained datasets.

Decreasing the temporal resolution (only releasing data for larger time windows) increases both the chances of successful trajectory recovery, and of re-identification.

The best defence for preserving privacy in aggregated mobility datasets should come as no surprise to you – we need to add some carefully designed random noise. What that careful design is though, we’re not told!

… a well designed perturbation scheme can reduce the regularity and uniqueness of mobile users’ trajectories, which has the potential for preserving mobile users’ privacy in aggregated mobility data.

38 days ago
This is a... scary problem.
Central Indiana

## Editor's Soapbox: Basic Manners by snoofle Wednesday May 3rd, 2017 at 7:52 PM

2 Shares
As someone who's been accused of "not being a team player" because I had the temerity to say, "No, I can't come in on short notice on a day I've called off, because I'm busy," Snoofle's rant struck a nerve. I lend him the soapbox for today. -- Remy

When you're very young, your parents teach you to say please and thank you. It's good-manners 101. Barking give me ..., get me ... or I want... usually gets you some sort of reprimand. Persistent rudeness yields reprimands of increasing sternness such as no dessert, no TV, etc. Ideally, once learned, those manners should follow us into the grown-up world.

Should.

When you work in IT, particularly in the financial industry, especially on critical systems on which all firm-trading is based, you tend to work with people who think that they are the only people on Earth, and that their individual problem is more important than everyone and everything else. You make a whole lot of money, but you get a lot of abuse from both users and managers. Over time, you develop a pretty thick skin and learn to not take anything personally. Of course, eventually you get tired of dealing with the same old arrogance. Late last summer, I had finally had it with the Wall Street stupidity and decided to retire.

After a few months of taking it easy, I started to do volunteer work teaching elderly people the basics of technology. It's not exactly challenging to explain the fundamental differences between Notepad, Word, e-mail, IM, Skype and so forth, but every once in a while, one of the 1940's-set gets it, and it just makes it all worthwhile.

One day, I ran into someone who runs a large retail operation. He offered me a part time job to do routine work in his warehouse-type store, and I could work whatever hours I wanted. It's rote, mindless work, but I get to meet and talk to new people all the time (which I think is great) so I accepted.

After a couple of weeks of doing this and the volunteer work, I noticed a vast difference in how people treat you compared with work in IT. For one thing, when people ask you to do something, they start with the word please and finish by saying thank you. Another is that when something needs to be done, they don't assume "Magic Happens Here"; they actually try and figure out how much effort is involved in the task before promising someone else that it will be done after some arbitrary time interval.

Nearly four decades in IT has allowed me to accumulate a rather large box of assorted PC parts and cables. When someone has a problem with their machine, more often than not, I can find something in that box that fixes their problem. The women bake me pies, and I've gotten more than a few bottles of booze in appreciation. At the box store, when customers ask you a question, they don't want to tell you what they think; they actually want to hear the answer. How often does that happen in IT?

When something happens when I'm not around and people call me for help, they're actually apologetic for interrupting my personal time and ask if it's convenient or if there's a better time to talk. When was the last time anyone at work interrupted you in the middle of the night or weekend, and assumed that you might be doing something, I don't know, personal?

I realize the IT industry is comparatively new, but it's been around for more than fifty years at this point. You'd think that after a half century people might have learned that technical people are not there to be abused, and deserve to be treated decently, like anybody else. Instead, they seem to have forgotten those early lessons in manners.

49 days ago
Central Indiana

## Hoosier Weddings through the (P) Ages by Justin Clark Thursday April 27th, 2017 at 9:59 PM

1 Comment

The New York Times recently ran a piece about its long and interesting history of wedding notices, specifically its first notice published on September 18, 1851. Sarah Mullett and John Grant were married by the Reverend Thomas P. Tyler at Trinity Episcopal Church in Fredonia, New York on September 10, 1851. It got us at Hoosier State Chronicles thinking about wedding notices in our neck of the woods. Throughout the decades, newspapers from all across Indiana published wedding notices, sometimes before the wedding and sometimes after, and occasionally with extended coverage of the ceremony. In this blog, we will take you through a few notices to give you a sense of how Indiana newspapers covered Hoosiers tying the knot.

One of the earliest wedding notices that we found came from the Vincennes Indiana Gazette on October 23, 1804, before Indiana’s statehood. During these early years of Indiana papers, the wedding notices were fairly basic, often only sharing the exact details of the wedding and nothing else. Here’s the exact text from the Indiana Gazette:

MARRIED, On Sunday evening last, Mr. John M’Gowan to the amiable Miss Sally Baltis, both of this county [Knox County].

Besides the word “amiable,” this notice contains very little information, despite the couple being local. Similar wedding notices were published in the Vincennes Western Sun in 1810 and 1814 and the Charlestown Indiana Intelligencer in 1825.

Early Indiana papers also published breaches of marriage. For example, a piece in the December 14, 1816 issue of the Western Sun  noted that a “breach of marriage promise, between Margaret Logan, plaintiff, and Rob[er]t Gray defendant, was yesterday tried in the Court of Common Pleas of this county [Knox County].” The trial resulted in a “verdict for \$1,000 [in] damages—the sum claimed in the declaration,” likely going back to Logan.

Another common tradition in the early years of wedding notices was the use of the subheading “hymeneal,” meaning “nuptial.” Sadly, one of the early uses in the Indiana Republican misspelled the word as “hymenial,” which is a type of fungus.  Nevertheless, papers like the Republican used the term during the early half of the nineteenth century, as a way to group a few wedding notices into a single piece. The Republican hymeneal from 1817 (with the misspelling) provided notices for two weddings, separated by an anonymously authored poem:

Not Eden with its shades and flowers,

Was Paradise till women smil’d; –

Then what’s this dreary world of ours,

Without creation’s loveliest child.

In an April 27, 1838 issue of the Brookville American, another Hymeneal, spelled right this time, ran on the third page. Four separate weddings from both Indiana and Ohio make up the column. One particular wedding announcement went out late, so it came with an “apology to the parties . . . that it was mislaid.”

Alongside descriptions of wedding notices, newspapers also advertised the costs of publishing a notice. An advertisement in the December 24, 1855 issue of the Indianapolis Daily State Sentinel displayed the cost of publishing a marriage notice as \$1, which in 2016 dollars amounts to \$15.92. Still a bargain, if you want people to know about your wedding.

By the 1870s and 1880s, the notices kept the same style but lost some of the century’s earlier pretensions. For example, the term “hymeneal” went to the wayside, in favor of a more generic “announcements” section. This is exactly how the Indianapolis News published a wedding notice in its February 12, 1885 issue.

That’s not to say there were not outliers. One of the most interesting newspapers available in Hoosier State Chronicles is the Smithville-based Name It and Take It!. A rather obscure paper, it only ran a few months in 1897 before folding. In the June 25, 1897 issue, a wedding notice was published under the heading of “ROMANTIC!”, the use of an exclamation point being the standard practice on nearly every piece in the notices section. “The Rev. A. S. [Alexander “Sandy”] Baker married a couple on short notice last Saturday, in the clerks [sic] office at Bloomington. The contracting parties were: John Worley, and Catherine Adams,” the paper reported. Based on the exclamation point heading, the paper wanted you to be as excited for the couple as they apparently were.

By the early 20th century, some wedding pieces became slightly more irreverent, like human interest stories you might read in your local paper. In the July 16, 1908 issue of the Richmond Palladium, an article ran entitled “Married in Shirt Waist and Skirt.”  Ted Hall, “a young business man of St. Louis,” arrived in the city, quickly proposed to “Miss Nettie Lamar,” and they were married the same day. As the paper noted, the “ceremony was set in such a short time that the bride had to be married in shirt waist and skirt.” This would be the equivalent of a young lady getting married in a pair of capris and a t-shirt today, which is quaint, even charming.

The Indianapolis News during the 1910s provided a large section of its paper to marriage notices, with notifications from all over the state. This trend continued well into the 1920s, as exemplified in an April 29, 1929 issue of the Greencastle Herald. One particular nicety that the Herald extended to the newly-wedded couples was delaying the publication of the notices, after an arrangement with the county clerk.

Other newspapers gave their wedding notice section clever titles. In a 1939 issue of the Indianapolis Recorder, the paper named its section “In Dan Cupid’s Files,” and provided nine separate notices (one was an engagement). One interesting notice noted that “Miss Ella Louise Freeman and L. C. Phelps were secretly married in Chicago” the previous March and then intended to “reside in Philadelphia.” This notice brings up so many questions. Why were they “secretly married?” What necessitated that chain of events? How did their parents feel about it? These would be great topics of research for a more in-depth analysis of wedding notices. However, that is outside the scope of this short tour.

Some wedding notices were so detailed that they warranted a front-page publication. This was the case with a notice published in the August 16, 1940 issue of the Dale News. Robert J. Lubbehusen, a U. S. Navy officer, and Miss Frances Fuchs, “second daughter of Mr. and Mrs. Ed Fuchs of St. Meinrad, Ind.” were “quietly married in the Abbey Church” in St. Meinrad. The unincorporated community of St. Meinrad houses a monastery and church for Benedictine monks. As their website describes, “Saint Meinrad Archabbey was founded in 1854 by monks from Einsiedeln Abbey in Switzerland. They came to southern Indiana at the request of a local priest who was seeking help to serve the pastoral needs of the growing German-speaking Catholic population and to prepare local men to be priests.” The small town newspaper published this notice on the first page, in what was probably otherwise a slow news week. Additionally, Lubbehusen’s active service in the Navy, roughly a year out from American involvement in WWII, may have inspired a front-page notice.

By the 1950s, photographs became a more standard practice for wedding notices in Indiana papers. The Jewish Post ran a full-page wedding notices section with mostly photographs of happily-wedded couples either leaving on their honeymoon, walking down the aisle together after the ceremony, or cutting their cake. Alongside the couples, the Post also published the names of their photographers, Miner-Baker and Julius Marx. Not only did this give credit where credit was due, but it was great advertising for the photographers. Engaged couples could see these nice photos in the paper and then follow up with Marx or Miner-Baker to have them photograph their unions. The wedding notice as advertisement represents another interesting development in Indiana wedding notices.

The last three wedding notices on this tour of history, from the 1960s, 70s, and 80s respectively, indicate that while wedding notices have changed since the beginning of Indiana’s history, they maintained a basic structure. The September 23, 1960 page of wedding notices from the Jewish Post provided the same familial and logistical information, but it also included details on the bride’s dress. The bride, Elayne Rosanne Kroot:

. . . appeared in a formal-length gown of pure silk peau de soie of ivory color, trimmed with re-embroidered hand-clipped Alencon lace highlighted by matching seed pearls and crystals forming an Empire bodice.

This notice’s level of detail contrasted with the more direct, less detailed notice for another couple on the same page. (The wedding notice in the August 24, 1979 Jewish Post also displays a shorter, more direct style.) This contrast suggests a subtle distinction of class, where the longer, more detailed notice cost more to publish than the shorter notice. Again, this would be a great avenue for future research.

Our last notice page comes from the June 23, 1984 issue of the Indianapolis Recorder. These notices might be the most complete notices we will unpack in our journey. The notices are detailed, with logistical information, details on the brides’ dresses, the musical arrangements (including songs played), and a rough timeline of the entire ceremony and reception. These were also paired with photographs of the happy couples. To see the most modern representation of wedding notices, this is one of the best examples from Hoosier State Chronicles.

With that, our trip through Indiana’s wedding notices has come to an end. If you’d like to see more notices, head over to Hoosier State Chronicles. If you search “wedding” or “married,” you get literally thousands of hits, from nearly 200 years of Indiana newspapers. There’s certainly more than a fair share of Hoosier weddings to explore.

55 days ago
Interesting to see how these things evolve.
Central Indiana

## Check out: ‘Double King’, a fantastic animated short by David Malki Saturday April 22nd, 2017 at 7:41 AM

2 Shares

I absolutely loved this new animated short by Felix Colgrave, “Double King”:

“Double King” on YouTube

It’s well worth the nine minutes to watch. Just stunning animation (and sound). It’s crafted with a level of precision, but also whimsy, that mesh in surprising and fascinating ways.

BONUS LINK: Felix Colgrave has an entire YouTube channel of prior work for ADDITIONAL HOURS OF ENJOYMENT
