Old 07-08-2009, 02:38 PM   #1
Mp3Car Staff
 
Join Date: Aug 2005
Posts: 88
Blog Entries: 9
ecog is on a distinguished road
Data Privacy HELP!

As people submit their tracks, data privacy will be a huge issue.

Some people don't care if everybody on the planet knows where they go and when (this would be you, Bugbyte), and some people like to keep their whereabouts private, for example super secret international spies who would like to contribute to the project.

All jokes aside, if we don't address privacy issues many people will not be comfortable contributing.

I'm not sure how to approach this. On one hand it is just a collection of tracks on a server that have no identifiable information about anybody (beyond lat/long location).

Most of us wouldn't go through the effort to analyze data to use it for anything other than map generation. But then again I'd rather not be responsible for people's safety if somebody does find a way to figure out/track a specific contributor.

So the question is, once the data is uploaded, how do we make it public and usable to anybody in the world without compromising the identity of the contributors?

The only solution so far that comes to mind is stripping the beginning and trailing waypoints of each track. For example, if I have a track from my house to my friend's house and that track consists of 100 waypoints, I would strip the first and last 10 waypoints so anybody else would see that the track starts in the middle of a public street and ends in a public street.
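A minimal sketch of that trimming idea, assuming a track is simply a Python list of (lat, lon) tuples. The function name `strip_endpoints` and the parameter `n` are made up for illustration and are not part of any existing tool:

```python
def strip_endpoints(track, n=10):
    """Return the track without its first and last n waypoints.

    track: list of (lat, lon) tuples (assumed format).
    n: number of waypoints to drop at each end (hypothetical parameter).
    A track too short to survive trimming comes back empty, so nothing
    near the endpoints is published.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if len(track) <= 2 * n:
        return []
    return track[n:len(track) - n]


# Example: a 100-waypoint track keeps its middle 80 waypoints.
track = [(40.0 + i * 0.001, -76.0) for i in range(100)]
public = strip_endpoints(track, n=10)
```

Note that a fixed count of waypoints hides a different distance depending on speed and logging rate, so this is only a starting point.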

Let me know what you think.
Old 07-08-2009, 02:44 PM   #2
North of the land of Hey Huns
 
Join Date: Jun 2004
Location: Westminster, MD
Posts: 993
malcom2073 is a name known to all
While I think the entire argument is absolutely silly (along the lines of "oh my god they're out to get me" kind of silly), I agree that stripping a user-configurable number of waypoints, or perhaps a user-configurable distance from start and end, might be the way to do it. E.g., if the points are within, say, a mile of a person's "secret" locations, then don't log or include them.
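One way the "within a mile of a secret location" rule could look, as a rough sketch. It assumes waypoints are (lat, lon) tuples in decimal degrees and uses the haversine great-circle formula; `drop_near_private`, `private_points`, and `radius_miles` are hypothetical names:

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def drop_near_private(track, private_points, radius_miles=1.0):
    """Keep only waypoints farther than radius_miles from every private point."""
    return [p for p in track
            if all(haversine_miles(p, q) > radius_miles for q in private_points)]
```

The list of private points would have to stay on the user's own machine; sending it to the server would defeat the purpose.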
__________________
RevFE - Try it, you just might like it.
Carbon - Next Generation Touchscreen Browser
Come join us on IRC: irc.efnet.net #mp3car
Audiophiles make me chuckle as they pad my wallet.
Old 07-08-2009, 02:52 PM   #3
Variable Bitrate
 
Join Date: Apr 2005
Location: PoCo, Indiana
Posts: 245
thekl0wn is on a distinguished road
You mean this isn't going to be a stalker's paradise?

X-amount of seconds could automatically be taken off of each track at/by the server.

Once data starts being loaded, processed, and placed onto the server/database, I would think that it should be a fairly easy process to edit out a block of points. For instance, if user-X doesn't want everyone to know when he goes to the strip club, he should be able to set up parameters which disregard any of his/her uploaded waypoints within those boundaries.
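The "X seconds off each track" idea could be sketched like this, assuming each waypoint carries a timestamp in seconds as its first field. The tuple layout and the name `trim_seconds` are assumptions, not anything the server actually does:

```python
def trim_seconds(track, seconds=60):
    """Drop waypoints recorded in the first and last `seconds` of a track.

    track: list of (timestamp_seconds, lat, lon) tuples sorted by time
    (assumed format). Tracks shorter than 2 * seconds come back empty.
    """
    if not track:
        return []
    start = track[0][0] + seconds
    end = track[-1][0] - seconds
    return [p for p in track if start <= p[0] <= end]
```

Trimming by time hides more road at highway speed than in a parking lot, which may or may not be what the contributor wants.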
__________________
Planning [----X-----] 40%
Programming [-X-------] 20%
Parts [-----X----] 50%
Install [X--------] 5%

See Me In A Pink Skirt
Old 07-08-2009, 03:05 PM   #4
Maximum Bitrate
 
Join Date: Jan 2008
Location: on the border of northern IL/IN
Posts: 848
soundman98 has a spectacular aura about
i agree that there needs to be some sort of protection for those who contribute to the maps. the biggest problem i see is the valuable data that is lost by taking out a specific range (stripping out even a quarter mile from my house would leave a lot of roads unmarked). i think the best way to accomplish this would be to have the users themselves decide what data to include/remove. what if the software that uploads the gps info gave the person the option to add/remove certain areas?

this way, anyone who doesn't want to be tracked can determine at what point they don't want to contribute. this would also come in handy for those who have very long driveways that a gps could possibly interpret as being a cross road...

i could also see an issue (at first at least, with minimal people being tracked) where a circular gap in the map (from not tracking a certain distance from your house) would make that person just as easy to find for someone who wanted to...
Old 07-09-2009, 11:48 AM   #5
Admin. Don't bug or I'll byte.
 
Join Date: Sep 2004
Location: Corning, NY
Posts: 6,096
Bugbyte is a splendid one to behold
How about simply the option to set or unset that user-configurable distance or time? Leave it up to the user as to how much of it to expose. I'm not real worried about someone figuring out where I live. That's pretty dead-simple to do.

And if I upload in non-real time, they can't be sure when I'll be there or not.

I do think an option to divorce the track data from user identification should be allowed. That way, no agency of the gov't could use it to issue you a speeding ticket, for example. Or, more likely, to use it to track a suspect in an investigation.

I'm not a conspiracy theorist, but I'd hate to see it used against you simply because your data showed you in the vicinity of a crime at the time that it occurred.
__________________
Want to:
-Find out about the iBug?
-Stop being a newbie? Take a look at the FAQ Emporium?
-Find out about carPC's in just 5 minutes? View the Car PC 101 video
-Help me kill my car PC
-Watch live video streams from my mobile PC? Check it out here.
-Where is the iBug?
Old 07-09-2009, 12:07 PM   #6
Variable Bitrate
 
Join Date: Apr 2005
Location: PoCo, Indiana
Posts: 245
thekl0wn is on a distinguished road
Conspiracy theorist, no... That's not what I'd consider your opinion on the topic. I'd consider it more being smart about the liability of the project as a whole!
Old 07-09-2009, 01:34 PM   #7
Mod - OBDII GPS Logger forum
 
Join Date: Mar 2009
Location: Los Angeles
Posts: 380
chunkyks is on a distinguished road
http://www.hhs.gov/ohrp/humansubject...ce/45cfr46.htm

This is in the realm of protection of human subjects [or HSPC as they call it where I work]. I don't think this question is paranoid, or conspiracy-ish, or anything like that - at my place of work, this is of pivotal importance... And one thing specifically on our list of things to be de-identified is GPS co-ordinates.

Quote:
I do think an option to divorce the track data from user identification should be allowed. That way, no agency of the gov't could use it to issue you a speeding ticket, for example. Or, more likely, to use it to track a suspect in an investigation.

It shouldn't just be "allowed", it should be *forced*. Data should *always* get de-identified, no matter what.

Obviously searching on slashdot leads to a series of paranoid conspiracy theories, but I do find it a decent clearinghouse of legitimately useful links on this very topic. site:slashdot.org gps tax

This topic is so much more important than the weight it's currently assigned. I realise that I'm treading a deadly ground with obdgpslogger in this regard, but I made a pre-meditated design decision a long time ago to *not* attach any identifying information at all to the database. I normalise a lot of data exported to Google Earth, and I think I will, in future, also provide an option to normalise CSV data, or even normalise data going into the database.

Gary (-;
Old 07-09-2009, 01:58 PM   #8
darth sidious lite
 
Join Date: Jul 1978
Location: Baltimore, MD
Posts: 1,156
Blog Entries: 113
Fiberoptic will become famous soon enough
Quote: Originally Posted by chunkyks View Post
http://www.hhs.gov/ohrp/humansubject...ce/45cfr46.htm

This is in the realm of protection of human subjects [or HSPC as they call it where I work]. I don't think this question is paranoid, or conspiracy-ish, or anything like that - at my place of work, this is of pivotal importance... And one thing specifically on our list of things to be de-identified is GPS co-ordinates.



It shouldn't just be "allowed", it should be *forced*. Data should *always* get de-identified, no matter what.

Obviously searching on slashdot leads to a series of paranoid conspiracy theories, but I do find it a decent clearinghouse of legitimately useful links on this very topic. site:slashdot.org gps tax

This topic is so much more important than the weight it's currently assigned. I realise that I'm treading a deadly ground with obdgpslogger in this regard, but I made a pre-meditated design decision a long time ago to *not* attach any identifying information at all to the database. I normalise a lot of data exported to Google Earth, and I think I will, in future, also provide an option to normalise CSV data, or even normalise data going into the database.

Gary (-;

I believe that in order to properly weight, ignore or qualify the tracks being uploaded, it would be essential to be able to track the user that submitted the trail.

At the same time I completely agree with the need to protect privacy. We never should be in a position where we even have the ability to provide data for a subpoena.

Is there technology that we could borrow from the medical or cryptography world to allow us to weight the inbound gps streams and still maintain privacy?

I talk more about the need for weighting the quality of the upload here. Here is an excerpt. This link has more details.
Quote: Originally Posted by Fiberoptic View Post
The algorithms would also be smart enough to possibly throw out anomalies. Let's just say for example that I am a probe. I report with my iPhone. I regularly bike the wrong direction on one-way streets and speed 20 miles over the speed limit. The algorithm would eventually throw out certain parts of my data that are way outside the norm and negatively weight all of my other reports.


Last edited by Fiberoptic; 07-09-2009 at 01:59 PM.. Reason: excerpt
Old 07-09-2009, 02:14 PM   #9
Mod - OBDII GPS Logger forum
 
Join Date: Mar 2009
Location: Los Angeles
Posts: 380
chunkyks is on a distinguished road
Quote:
Is there technology that we could borrow from the medical or cryptography world to allow us to weight the inbound gps streams and still maintain privacy?

What we do here is have "cold rooms". PCs airgapped from the outside world, where linking tables are created. Data is split into two tables, one mapping the identifying data to an opaque id [usually generated with some kind of function from the other data in the row - eg, it might be SSN+DoB mangled in a specific way]. This table is stored where no-one can get it. The other table maps that opaque row ID to the actual data, and is the one that's actually copied out of the cold room and operated on.
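A rough sketch of that two-table split, purely for illustration. It assumes the opaque id is a keyed hash (HMAC-SHA256) of the identifying fields; the post only says "some kind of function", so the choice of HMAC, the names `opaque_id` and `split_rows`, and the row layout are all assumptions:

```python
import hashlib
import hmac


def opaque_id(identifying_fields, secret_key):
    """Derive an opaque id from identifying fields with a keyed hash.

    identifying_fields: tuple of strings, e.g. (name, date_of_birth).
    secret_key: bytes known only inside the "cold room" (assumption).
    """
    message = "|".join(identifying_fields).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def split_rows(rows, secret_key):
    """Split (identity, data) rows into a linking table and a working table.

    linking: opaque id -> identity; this one never leaves the cold room.
    working: list of (opaque id, data); this one is copied out and used.
    """
    linking, working = {}, []
    for identity, data in rows:
        oid = opaque_id(identity, secret_key)
        linking[oid] = identity
        working.append((oid, data))
    return linking, working
```

Rows from the same person get the same opaque id, so per-contributor weighting still works on the working table without exposing who the contributor is.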

Of course, this is technically subpoenable I believe. It's also not necessarily feasible for this scenario. I will ask around at work for some suggestions - there's people here who've been dealing with HSPC for literally decades.

Gary (-;
Old 07-14-2009, 05:15 PM   #10
Maximum Bitrate
 
Join Date: Jan 2008
Location: on the border of northern IL/IN
Posts: 848
soundman98 has a spectacular aura about
Gary, would you happen to have an update for us on the best way to approach this?
(i have started logging, but am hesitant to upload until the privacy issues are worked out)
Old 07-15-2009, 11:31 AM   #11
Variable Bitrate
 
Join Date: Apr 2005
Location: PoCo, Indiana
Posts: 245
thekl0wn is on a distinguished road
Same boat here, as far as starting logging...

Also, before I finalize a release of even the simplest of user apps, I'd need to know what can/can't be used.

And on the topic of that, I'm working on that app, in hopes that I can specify a singular lat/lon point, and then filter out a square chunk of data within X-distance of it. (and have multiple points like this, stored locally on the user's computer)
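A possible shape for that square filter, as a sketch only. It assumes (lat, lon) tuples in decimal degrees and roughly 69 miles per degree of latitude; `square_around` and `filter_squares` are hypothetical names, and the approximation breaks down near the poles and across the 180° meridian:

```python
import math

MILES_PER_DEGREE_LAT = 69.0  # rough approximation


def square_around(lat, lon, half_side_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) for a square centred on a point."""
    dlat = half_side_miles / MILES_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = half_side_miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return (lat - dlat, lat + dlat, lon - dlon, lon + dlon)


def filter_squares(track, squares):
    """Drop every waypoint that falls inside any of the given squares."""
    def inside(p, sq):
        min_lat, max_lat, min_lon, max_lon = sq
        return min_lat <= p[0] <= max_lat and min_lon <= p[1] <= max_lon
    return [p for p in track if not any(inside(p, sq) for sq in squares)]
```

Several private points just mean several squares in the list passed to `filter_squares`.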
Old 07-21-2009, 01:56 PM   #12
darth sidious lite
 
Join Date: Jul 1978
Location: Baltimore, MD
Posts: 1,156
Blog Entries: 113
Fiberoptic will become famous soon enough
Quote: Originally Posted by chunkyks View Post
I will ask around at work for some suggestions - there's people here who've been dealing with HSPC for literally decades.

Chunkyks, did anyone from work give you any ideas? If not, who would be the expert to ask about this? We might be able to get them to solve the problem for free as part of a blog post.
Old 07-21-2009, 02:31 PM   #13
Mod - OBDII GPS Logger forum
 
Join Date: Mar 2009
Location: Los Angeles
Posts: 380
chunkyks is on a distinguished road
Sorry, this completely slipped my mind. <walks off for twenty minutes and talks to some coworkers>


After some thought and discussion, the root problem is that we have two mutually incompatible requirements:

1) The question we explicitly want to be unable to answer is:
"Given a user, X, which traces belong to that user?"
2) The question we explicitly *do* want to be able to answer is:
"We're trying to audit user X, to find out how many miles they've uploaded to the database"

One guy I work with said that in the past, he's used trapdoor hashes on the uniquely identifiable IDs. E.g., it was a large database of driver's license numbers, with sexual habits and STDs for those individuals. What he did was push all the driver's license numbers through a trapdoor function. This left him with a dataset that he couldn't use to uniquely identify the people involved, but anyone holding the original database could still answer a question like "what are the sexual habits and STDs of the person with driver's license number XXX".

Of course, this is backward from us [that model explicitly was able to answer the question "Given a user, X, what traces belong to that user?"], but I think there's some potential there, for ideas of hashes and stuff.

Perhaps it would be useful to hash each actual trace, and attach the hash to a user. That way you'd be able to answer the question, for each trace in the system, which user uploaded it. That would make auditing all the users in the system doable on occasion [depending on number of traces in the system].
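To make the hash-per-trace idea concrete, here is a minimal sketch assuming SHA-256 over a canonical text form of the trace. The canonical format, the `UploadLedger` class, and its method names are invented for illustration:

```python
import hashlib


def trace_digest(trace):
    """SHA-256 digest of a trace given as a list of (lat, lon) tuples.

    Coordinates are rounded to 6 decimal places so the same trace always
    serialises to the same bytes (assumed canonical form).
    """
    canonical = "\n".join(f"{lat:.6f},{lon:.6f}" for lat, lon in trace)
    return hashlib.sha256(canonical.encode("ascii")).hexdigest()


class UploadLedger:
    """Maps trace digests to user ids; the traces themselves carry no user id."""

    def __init__(self):
        self._owner_by_digest = {}

    def record(self, user_id, trace):
        digest = trace_digest(trace)
        self._owner_by_digest[digest] = user_id
        return digest

    def uploader_of(self, trace):
        """Return the user id that uploaded this exact trace, or None."""
        return self._owner_by_digest.get(trace_digest(trace))
```

Going the other way (from a user to their traces) means hashing every stored trace and comparing, so this raises the cost of that lookup rather than making it impossible.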

Another option that I had been considering was converting the data upon upload. Convert traces to just lat/lon/alt traces, and make that identifiable to a user. We'd need some trusted way to do this, though, and trust in computers is hard to come by. Showing that someone has, at some point, been at a location is a lot less useful than showing when they were there. But still, that's quite a thing for someone to be able to use as subpoenable evidence. Alternatively, just a miles conversion... the problem is a lack of auditability.
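Both conversions could be sketched roughly as below, assuming each uploaded point is a (timestamp, lat, lon, alt) tuple. The function names are hypothetical, and the mileage uses the haversine formula on lat/lon only, ignoring altitude:

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def strip_time(trace):
    """Reduce (timestamp, lat, lon, alt) points to (lat, lon, alt)."""
    return [(lat, lon, alt) for (_t, lat, lon, alt) in trace]


def total_miles(points):
    """Sum great-circle distances between consecutive (lat, lon, alt) points."""
    miles = 0.0
    for (lat1, lon1, _a1), (lat2, lon2, _a2) in zip(points, points[1:]):
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dlat = p2 - p1
        dlon = math.radians(lon2 - lon1)
        h = (math.sin(dlat / 2) ** 2
             + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
        miles += 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))
    return miles
```

Storing only the `total_miles` figure per user is the "miles conversion" variant; as noted above, nothing in it can be audited afterwards.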

So, that's a couple ideas. Discussion?

Gary (-;
Old 07-21-2009, 03:01 PM   #14
Maximum Bitrate
 
Join Date: Jan 2008
Location: on the border of northern IL/IN
Posts: 848
soundman98 has a spectacular aura about
alright, i got a little lost in some of it (i don't understand hashes), but here is my new idea (i think this is similar to what you are implying)

instead of using/tracking specific users, what if we were to allow the computer to track user numbers, i.e. everyone is assigned a user id #, but no personal info is required to get a number or is saved on the upload, just the user id. in theory, this would separate the critical data from the gps data, and would make it harder for anyone to find anyone (unless you know who is assigned a specific user id).

using this idea, anyone that would want to put a image in their sig for how many tracks they uploaded would just need to link the " 'user id #' = 'tracks uploaded' ".
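a bare-bones sketch of that idea, assuming ids are random tokens handed out with no personal info attached. `AnonymousRegistry` and its methods are made-up names, and everything lives in memory here just to show the shape:

```python
import secrets
from collections import Counter


class AnonymousRegistry:
    """Hands out random user ids and counts uploads per id, nothing else."""

    def __init__(self):
        self._uploads = Counter()

    def new_user_id(self):
        # 16 hex characters of randomness; no name or email is stored.
        user_id = secrets.token_hex(8)
        self._uploads[user_id] = 0
        return user_id

    def record_upload(self, user_id):
        if user_id not in self._uploads:
            raise KeyError("unknown user id")
        self._uploads[user_id] += 1

    def tracks_uploaded(self, user_id):
        return self._uploads[user_id]
```

a sig image would then just look up `tracks_uploaded` for the id. anyone who learns which id belongs to which person can still connect the dots.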

i realize that nothing is that simple, but that's the idea...right?
Old 07-21-2009, 05:03 PM   #15
Mod - OBDII GPS Logger forum
 
Join Date: Mar 2009
Location: Los Angeles
Posts: 380
chunkyks is on a distinguished road
Simple is good, yes. There are two problems. One, there's still a linking table in a database somewhere, which pretty much voids any data privacy stuff.

The second problem is that if you don't have that linking table, how do you know that users aren't gaming the system? You need to be able to verify how much gps track [eg, in miles] users have uploaded.

Gary (-;