Work: Contract for Data Wrangler at Open Knowledge

Hi,

Open Knowledge are looking for a data wrangler to work on a range of projects.

I don’t think it is published anywhere yet; I’m just posting it as it is partly related to a project I’m working on.

This is contract work that would be for an average of about 4 days a week.

You can contact me if interested.

Below is a brief description, and sorry that the formatting got ruined but I can’t be bothered fixing it in this RTL editor :frowning:

What: working on cutting-edge, data-driven, high-impact open knowledge projects - for example, in the areas of government spending or health care - with a world-leading non-profit.

For this role we would accept 2 very different types of candidate:

High-performing expert in “small-to-medium” data wrangling (e.g. you’ve been scraping in Python for years, have a deep love for CSV etc)
Novice with clear ability to learn and develop rapidly (you only vaguely know coding but want to learn and will be able to show you can learn (very) fast)

In both cases the candidate must be a self-starter, capable of working remotely and taking initiative as well as working effectively in a team with others.

Skills and experience wanted - expert

Python and preferably some experience in one other language (including Node.js or “new” languages like Go)
ETL (in code)
Scraping web data (using code)
Cleaning, validating messy data (and good familiarity with “small data” formats e.g. CSV, Excel, PDF etc)
(This will impress us) You’ve built your own mini-framework for at least one of these tasks and can explain why you did it, what you learnt and how it compares to other available f/oss tools
Know what OLAP is and when you might use it
Knowledge of SQL and especially PostgreSQL
Know what map-reduce is, know of (preferred: familiarity with) related tools such as Hadoop and know if/when you would use this (but we probably won’t be using this much)
General
Good experience with version control tools, preferably Git and GitHub
[Expert] Agile, sprint-oriented development process
[Expert] Familiarity with free/open-source software and its processes (bonus points if you have run or contributed to an f/oss project)

Skills and experience wanted - Novice

You are smart (and there’s evidence of this - for example you graduated from a top university with a first or high second)
Note you do not have to have done a degree in anything “tech-related” - we welcome applications from English or Politics graduates as much as from Maths or Engineering (what we want is ability and the willingness and capability to learn with an open mind)
Demonstrated ability to learn very fast - relish and thrive on challenge
Use of a spreadsheet (e.g. Excel, Google Docs) including formulae (plus for macros)
Bonus: you have done some limited programming, you run your own blog, and have maybe had to dig into some HTML at some point
[Can we relate this to School of Data courses]

General (for both)
Within 3-4h of London’s timezone (we’d love to be more flexible but in our experience it is hard to work effectively with someone further afield)
You take the initiative, relish challenge, and can work in a self-directed manner in harmony with a broader team
[Highly Desirable] Prior experience working effectively with remote teams

Form of engagement
Likely length of the engagement: ongoing, can be flexible in terms of time commitment but looking at a minimum of 4d a week (on average - so you can take out weeks or days and make them up at other points)
Start date: asap
Form of engagement: this will be a contracting role (though possibly fixed-term employment if you are UK-based)

Test: http://wiki.okfn.org/Get_The_Data_Challenge