Open Journalism & the Open Web

Week 4: Audio and chat log

Danielle Fankhauser
Mon, 2010-10-11 20:09

Audio: http://apps.calliflower.com/recording/download/886

Mai Hoang says:    
great stuff. a lot to explore


Steve Myers says:    
later


david mason says:    
thank you !


Mariano Blejman says:    
thanks!


Marlon x says:    
very interesting, thanks


Tim Groves says:    
thanks!


Richard Conniff says:    
Good stuff


Chris Nicholson says:    
The sound was crystal clear this time!


Phillip Smith says:    
Outline of assignment: 

+ Pick a subject matter you want to investigate
+ Identify a dataset or datasets that will help you formulate your story. For this exercise, only pick a dataset that is already available on the Web, e.g. via Data.gov or a state- or city-level data website.
+ Plan:
- What you need to do to clean these data, e.g. remove columns, perform functions on other columns, remove special characters ...
- The schema you'll make to house the dataset(s) — column names, column types
- What are you doing with this data — Are you proving an existing thesis (and if so, what is it) or are you seeking to better understand the subject matter (and if so, what questions will this data answer)?
- What will your queries look like? Are you going to join multiple databases together? If so, how, and why will the results of the join be accurate and relevant? Hackers: If possible, write out a query in SQL.
- How will you express the results of your inquiry? In the text of an article? As a chart or graph? Informing a search for relevant photos and video?
- What questions won't the data answer that you want to address in your project? Who will you turn to as you start looking for those answers?
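The planning steps above (cleaning, a schema, and a join written in SQL) might look something like the following minimal sketch. It uses Python's built-in csv and sqlite3 modules; the two small datasets, the table and column names, and the clean_name helper are all invented for illustration and do not come from any real dataset discussed in the session.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for two downloaded CSV files.
# Every name and number here is invented for illustration.
RAW_INSPECTIONS = """facility_id,facility_name,violations,notes
101, Sunny Days Care ,3,n/a
102,Little Oaks*,0,n/a
101, Sunny Days Care ,2,n/a
"""

RAW_FACILITIES = """facility_id,city
101,Denton
102,Austin
"""


def clean_name(value):
    """Trim surrounding whitespace and drop special characters such as '*'."""
    return "".join(ch for ch in value.strip() if ch.isalnum() or ch == " ")


def load(conn):
    """Create the schema (column names and types) and insert the cleaned rows."""
    conn.execute(
        "CREATE TABLE inspections ("
        "facility_id INTEGER, facility_name TEXT, violations INTEGER)"
    )
    conn.execute(
        "CREATE TABLE facilities (facility_id INTEGER PRIMARY KEY, city TEXT)"
    )
    for row in csv.DictReader(io.StringIO(RAW_INSPECTIONS)):
        # The 'notes' column is deliberately not loaded (a removed column).
        conn.execute(
            "INSERT INTO inspections VALUES (?, ?, ?)",
            (
                int(row["facility_id"]),
                clean_name(row["facility_name"]),
                int(row["violations"]),
            ),
        )
    for row in csv.DictReader(io.StringIO(RAW_FACILITIES)):
        conn.execute(
            "INSERT INTO facilities VALUES (?, ?)",
            (int(row["facility_id"]), row["city"]),
        )


# Join the two tables on their shared key and total the violations per facility.
QUERY = """
SELECT f.city, i.facility_name, SUM(i.violations) AS total_violations
FROM inspections AS i
JOIN facilities AS f ON f.facility_id = i.facility_id
GROUP BY f.city, i.facility_name
ORDER BY total_violations DESC
"""


def run_example():
    """Build an in-memory database, run the join, and return the result rows."""
    conn = sqlite3.connect(":memory:")
    load(conn)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows
```

The join is only as trustworthy as the shared key: here both tables are assumed to use the same facility_id values, which is exactly the kind of assumption the assignment asks you to justify.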



Phillip Smith says:    
Or, if you're Canadian:


Phillip Smith says:    
Where to find tools for this week's assignment: http://toolkit.snd.org/ & http://help.hackshackers.com are great places to start your search.


Phillip Smith says:    
@David -- it is a bit of a curious service; thankfully, we don't have access to it from Canada! :)


david mason says:    
it doesn't offer anything like a sustainable wage


david mason says:    
I have moral issues with Mechanical Turk


Michael Newman says:    
@Nicholas: Yeah, definitely a problem there. But you are only giving a little information to each person. Especially if speed is of the essence.


david mason says:    
right now i'm quite keen on and would appreciate feedback on http://canbudget.zooid.org/wiki/David_Mason/P2PU_week_1_assignment which describes importing the G20/G8 budget via crowdsourcing and trying to make it presentable, participatory and reusable


Nicholas Judd says:    
Michael - True, but I'd be concerned about releasing my source materials pre-publication if I used Mechanical Turk.


Phillip Smith says:    
@David - Awesome


Michael Newman says:    
You had mentioned Mechanical Turk earlier. Amazon has a service where you can pay people to do some of that.


Chris Nicholson says:    
Hundreds of pages about fire hydrants? Oh my word... :\


sarahheartsthenews says:    
a colleague and I sorted through child protection service records in Texas to identify child care centers that were consistently performing poorly -- like repeated cases of abuse, health hazards -- and what they were doing to improve, and whether the state was being proactive enough and holding them accountable. I think we just used Excel and came up with a way to sort the records by certain keywords
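A keyword sort like the one described above was done in Excel; as a rough stand-in, the same idea can be sketched in Python. This is not the method the reporters used. The records, the keyword list, and the function names below are all hypothetical.

```python
from collections import Counter

# Hypothetical keywords and records, invented for illustration only.
KEYWORDS = ("abuse", "hazard")

RECORDS = [
    {"center": "Center A", "summary": "Reported abuse; health hazard in kitchen"},
    {"center": "Center B", "summary": "Routine visit, no issues"},
    {"center": "Center A", "summary": "Second abuse complaint"},
    {"center": "Center C", "summary": "Minor hazard near playground"},
]


def count_flagged(records, keywords=KEYWORDS):
    """Count, per center, the records whose summary mentions any keyword."""
    counts = Counter()
    for record in records:
        text = record["summary"].lower()
        if any(keyword in text for keyword in keywords):
            counts[record["center"]] += 1
    return counts


def repeat_offenders(records, minimum=2):
    """Return, in alphabetical order, centers flagged at least `minimum` times."""
    counts = count_flagged(records)
    return sorted(center for center, n in counts.items() if n >= minimum)
```

Plain substring matching is crude: it will also flag a summary that says "no abuse found", so flagged records still need to be read by a person.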


david mason says:    
it was years ago, just have a presentation now. http://zooid.org/~vid/presentations/publicwhip/


Phillip Smith says:    
@David -- Do you have the URL for that example? The Hansard data?


Chris Nicholson says:    
On the subject of OCR, I guess a lot of journalists get hard copies (paper copies) of important documents that need to be read by OCR to be converted to digital data? That's a question to everyone, I guess.


david mason says:    
there's a real movement around this worth tapping into


david mason says:    
scraped the Canadian Hansard but a year later they opened it up in a structured format.


david mason says:    
I prefer text :)


david mason says:    
although there was an OCR version of the text available too


david mason says:    
Steve Myers, no it was only 18 pages so a few people did it "crowdsourcing"


Steve Myers says:    
@david, you used Mechanical Turk for that?


david mason says:    
we did the same thing for canbudget with *faxed* government data


Richard Conniff says:    
what program do you use to convert PDF to a machine-readable format?


Steve Myers says:    
Ah, good point.


david mason says:    
Steve Myers, but the missing data becomes more evident and there are many successful cases of requesting data


Steve Myers says:    
Interesting point on APIs there, that they will only help you get at the info that the agency wants you to get at.


david mason says:    
wikileaks


Phillip Smith says:    
Other questions?


Phillip Smith says:    
Steve, Sarah, Tim: great, thanks! In the queue.


Tim Groves says:    
I am interested in learning ways of acquiring data. You mentioned using an API; can you go over how that works, and other ways to go about acquiring data?


Phillip Smith says:    
@Mariano: Shoot me an email and I'll try to help. :)


Sarah Laskow says:    
Nick, have you used Google Refine / Freebase Gridworks at all?


Mariano Blejman says:    
I've lost my partner too... :-(


Steve Myers says:    
Can you describe how you'd use Socrata and what it would tell you?


Phillip Smith says:    
:)


Phillip Smith says:    
Folks: Please queue up your questions for Nick & post them here, or use the raise hand feature.


Phillip Smith says:    
@David -- Heh.


david mason says:    
pedantic: i bet you use vim, not vi :) http://www.vim.org/


Phillip Smith says:    


Chris Nicholson says:    
Thank you, Phillip.


Phillip Smith says:    
@Chris: If you can shoot an e-mail to me, Dani, or Sarah, we can try to help out with that. :)


Chris Nicholson says:    
Anyone lost their course partner? I have! :-/

Comments

Sarah Chacko
Mon, 2010-10-11 20:15

These are the links to the stories my colleague and I wrote from the data we collected: http://www.dentonrc.com/sharedcontent/dws/drc/localnews/stories/DRC_Chil...
http://www.dentonrc.com/sharedcontent/dws/drc/localnews/stories/DRC_Pric...
Nothing major but it was an interesting process ...

Phillip Smith
Mon, 2010-10-11 20:47

Great stuff, thanks! :)