
History: Data Sets

Preview of this version: 1


GitHub - curran/data: A collection of public data sets (as of Jan 15, 2016)


Languages: HTML 97.9%, JavaScript 1.8%, Other 0.3%

Branch: gh-pages (branches: gh-pages, master)
Latest commit: 22c9c2b, Jan 12, 2016, by curran: Clean up README
Files and directories, each with its latest commit message and date. Entries marked "(name not captured)" lost their name when the page was saved.

  • Rdatasets: Add R datasets (Oct 25, 2015)
  • airbnb: Add Airbnb data set (Aug 4, 2015)
  • all: Add data from data soup meetup (Aug 30, 2015)
  • (name not captured): Add pincodes dataset (Dec 13, 2015)
  • appliedPredictiveModeling: Add applied predictive modeling data sets (Aug 5, 2015)
  • bokeh: Add Bokeh examples (Oct 25, 2015)
  • calc: Add calc s70 data (Oct 30, 2015)
  • cdc: Added full table data module with unindented causes for the entire ca… (Feb 19, 2014)
  • correlatesofwar: Add correlates of war data (Oct 30, 2015)
  • d3Examples: Add updated unemployment timeseries data (Aug 7, 2015)
  • data.gov.in: Added data sets from data.gov.in (Apr 21, 2014)
  • (name not captured): Add data.gov data (Aug 5, 2015)
  • dataSoup: Add data from data soup meetup (Aug 30, 2015)
  • datalibExamples: Add datalib examples (Aug 3, 2015)
  • (name not captured): Fixed bug where population was not showing up in CSV file for Geoname… (Apr 29, 2015)
  • dcjs: Add data sets from DC.js (Aug 4, 2015)
  • dspl: Added countries list from Google (Apr 1, 2015)
  • (name not captured): Added Africa undernourishment data set (Dec 1, 2014)
  • fbi: Add stub for FBI crime dataset (Dec 7, 2015)
  • gapminder: Update README (Aug 31, 2015)
  • geonames: Fixed bug where population was not showing up in CSV file for Geoname… (Apr 30, 2015)
  • integrated: Added population vs. gdp data set (Apr 28, 2015)
  • ipo: Add note on IPO calue column (Aug 5, 2015)
  • jsLibraries: Added js lib data set (Apr 1, 2015)
  • mattermark: Removed funky characters in CSV (Aug 18, 2015)
  • medicalStoreChallenge: Add data from medical store challenge (Aug 4, 2015)
  • migrants: Add data from data soup meetup (Aug 30, 2015)
  • motherjones: Add mother jones shooting data (Oct 30, 2015)
  • (name not captured): Added data about bachelors degrees fron NSF (Feb 12, 2014)
  • (name not captured): Update README.md (Dec 10, 2015)
  • oecd: Add house price data (Aug 5, 2015)
  • olpc: Fixed image (Apr 27, 2015)
  • (name not captured): Add house numbers in Montreal (Aug 3, 2015)
  • (name not captured): Add processed small data sets (Oct 19, 2015)
  • (name not captured): Add William Playfair trade data (Sep 6, 2015)
  • plotlyExamples: Add fertility-rates-in-south data set (Aug 3, 2015)
  • senseYourCity: Added unit test framework, added test for iris dataset parsing using … (Jul 31, 2015)
  • slavevoyages: Add slave voyages data (Oct 30, 2015)
  • statCounter: Remove .DS_Store Mac turd (Aug 27, 2015)
  • superstoreSales: Added superstore sales example data (Sep 2, 2014)
  • syntagmatic: Add data sets from syntagmatic (Aug 4, 2015)
  • tuskegeeInstitute: Add links to references (Nov 24, 2015)
  • tweets: Merge branch 'gh-pages' of github.com:curran/data into gh-pages (Nov 30, 2015)
  • uci_ml: Add cleaned avian flu data (Oct 28, 2015)
  • un: Add UN Data extract (Oct 19, 2015)
  • undp: Add undp data sets (Aug 6, 2015)
  • unhcr: Add note on refugees data (Nov 23, 2015)
  • (name not captured): Add cleaned avian flu data (Oct 28, 2015)
  • (name not captured): Added filtered versions for earthquake data (Apr 27, 2015)
  • util: Added earthquake data (Apr 27, 2015)
  • uwdata_voyager: Add Voyager data sets (Aug 5, 2015)
  • vegaExamples: Add copy of vega example data sets (Aug 3, 2015)
  • w3schools: Added stub for scraping w3schools browser market share data (Apr 3, 2014)
  • (name not captured): Add WSJ data set (Nov 14, 2015)
  • wikibon: Add Big Data Vendor data from Wikibon (Aug 12, 2015)
  • (name not captured): Update sm.pop.refg_Indicator_en_csv_v2.csv (Nov 20, 2015)
  • worldFactbook: Added world factbook data (Aug 14, 2013)
  • .gitignore: Added unit test framework, added test for iris dataset parsing using … (Aug 1, 2015)
  • Interest Group Spending 2000-2016.csv: Create Interest Group Spending 2000-2016.csv (Jan 11, 2016)
  • LICENSE: Add MIT Licence for #2 (Nov 13, 2015)
  • README.md: Clean up README (Jan 12, 2016)
  • package.json: Added unit test framework, added test for iris dataset parsing using … (Aug 1, 2015)
  • test.js: Added unit test framework, added test for iris dataset parsing using … (Aug 1, 2015)

README.md

data


A collection of public data sets for testing out visualization methods. These data sets are at various stages of preparation: some are just raw data, some are CSV files, and some are exposed as AMD modules. This collection is messy, but with some digging you may find hidden gems.

Targets for import:


Here's a listing of data sets with more detail. Each column is marked with its type for visualization:

Q = Quantitative, continuously varying numeric columns

T = Temporal, a timestamp

O = Ordered, distinct categories with a natural order (e.g. Low, Medium, High)

N = Nominal, distinct categories with no natural order (e.g. Ethnicity)

G = Geospatial identifiers (e.g. Country, City)
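
The legend above can be written down as a small lookup, which makes the column listings machine-readable. This is a minimal sketch, not part of the repository: the names `ColumnType` and `parse_schema` are hypothetical, and the sample entries simply reuse the `name: CODE` form of the listings in this README.

```python
from enum import Enum


class ColumnType(Enum):
    """Column type codes taken from the legend (Q, T, O, N, G)."""
    QUANTITATIVE = "Q"
    TEMPORAL = "T"
    ORDERED = "O"
    NOMINAL = "N"
    GEOSPATIAL = "G"


def parse_schema(lines):
    """Parse 'name: CODE' lines into a {name: ColumnType} dict."""
    schema = {}
    for line in lines:
        # Split on the last colon so names containing punctuation survive.
        name, code = line.rsplit(":", 1)
        schema[name.strip()] = ColumnType(code.strip())
    return schema


# A few entries in the README's own notation.
sample_schema = parse_schema(["age: Q", "workclass: N", "education: O"])
```

An unknown code raises `ValueError`, which is a convenient way to catch typos in a hand-written listing.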



UCI Machine Learning Repository - Adult (3.8 MB)


This data set demonstrates a mix of quantitative, ordinal, and nominal columns. To analyze this data set using visualization, it would be useful to aggregate the data on the fly before visualization.

  • age: Q
  • workclass: N
  • education: O
  • education-num: Q
  • marital-status: N
  • occupation: N
  • relationship: N
  • race: N
  • sex: N
  • capital-gain: Q
  • capital-loss: Q
  • hours-per-week: Q
  • native-country: N
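
One way to read "aggregate the data on the fly" is to group rows by an ordered or nominal column and summarize a quantitative one. The sketch below assumes rows are already parsed into dictionaries. The three sample rows are made up for illustration, and `aggregate` is a hypothetical helper, not something the repository provides.

```python
from collections import defaultdict

# Illustrative rows only; the real file has many more rows and columns.
rows = [
    {"education": "Bachelors", "hours-per-week": 40},
    {"education": "Bachelors", "hours-per-week": 50},
    {"education": "HS-grad", "hours-per-week": 38},
]


def aggregate(rows, group_by, measure):
    """Return {category: (row count, mean of measure)}."""
    totals = defaultdict(lambda: [0, 0.0])  # category -> [count, sum]
    for row in rows:
        entry = totals[row[group_by]]
        entry[0] += 1
        entry[1] += row[measure]
    return {key: (count, total / count) for key, (count, total) in totals.items()}


summary = aggregate(rows, "education", "hours-per-week")
```

The result has one entry per category, which is usually small enough to draw directly as a bar chart.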

Data Canvas Sense Your City (237MB or Real-time API)


This data set contains measurements collected by DIY sensor kits across several major cities: San Francisco, Bangalore, Boston, Geneva, Rio de Janeiro, Shanghai, and Singapore. There is a visualization competition for this data set, with submissions due March 20.

  • city: G
  • timestamp: T
  • temperature: Q
  • light: Q
  • airquality: Q
  • sound: Q
  • humidity: Q
  • dust: Q

Medical Store Geospatial Challenge (< 100KB)


This data set is small, but comes with a set of real-world questions about the data. This is also a competition, with submissions due April 25.

Referrers - Each row corresponds to information on a particular client referral source.

  • referrer_code: N
  • visit_count: Q
  • city (referrer city)
  • postal_code_referrer: G
  • (latitude, longitude): G

Clients - Each row corresponds to a client visit to the store

  • client_id: N
  • referrer_code: N
  • city (referrer city)
  • postal_code_referrer: G
  • (latitude, longitude): G
  • initial_visit_date: T
  • product_count: Q

UCI Machine Learning Repository - Individual household electric power consumption (20 MB)


This data set would be a great candidate to show multi-scale temporal aggregation.

  • timestamp: T
  • global_active_power: Q
  • global_reactive_power: Q
  • voltage: Q
  • global_intensity: Q
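
Multi-scale temporal aggregation can be sketched by truncating each timestamp to the start of its hour, day, or month and averaging the values in each bucket. The readings below are invented for illustration, and `SCALES` and `aggregate_by_scale` are hypothetical names chosen for this sketch.

```python
from collections import defaultdict
from datetime import datetime

# Made-up readings: (timestamp, global_active_power).
readings = [
    (datetime(2007, 1, 1, 0, 0), 2.0),
    (datetime(2007, 1, 1, 0, 30), 4.0),
    (datetime(2007, 1, 1, 1, 0), 6.0),
    (datetime(2007, 1, 2, 0, 0), 8.0),
]

# Each scale maps a timestamp to the start of its bucket.
SCALES = {
    "hour": lambda t: t.replace(minute=0, second=0, microsecond=0),
    "day": lambda t: t.replace(hour=0, minute=0, second=0, microsecond=0),
    "month": lambda t: t.replace(day=1, hour=0, minute=0, second=0, microsecond=0),
}


def aggregate_by_scale(readings, scale):
    """Return {bucket start: mean value} at the requested time scale."""
    truncate = SCALES[scale]
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in readings:
        bucket = truncate(timestamp)
        sums[bucket] += value
        counts[bucket] += 1
    return {bucket: sums[bucket] / counts[bucket] for bucket in sums}


hourly = aggregate_by_scale(readings, "hour")
daily = aggregate_by_scale(readings, "day")
monthly = aggregate_by_scale(readings, "month")
```

A visualization could switch between these dictionaries as the user zooms the time axis in or out.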

BrightKite User Check-ins (57.2 MB)


This data set would be a useful example for multi-scale aggregation in both space and time. This has been used as the motivating example for several Big Data visualization systems based on data cubes (imMens: Real‐time Visual Querying of Big Data, Nanocubes for real-time exploration of spatiotemporal datasets).

  • user-id: N
  • timestamp: T
  • (latitude, longitude): G
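
Aggregation in both space and time can be illustrated by counting check-ins per grid cell and time bucket at two resolutions. This is plain binning under simple assumptions (a latitude/longitude grid in degrees, time buckets from `strftime`). It is not the data-cube structure used by imMens or Nanocubes. The check-ins are made up and the function names are hypothetical.

```python
import math
from collections import Counter
from datetime import datetime

# Made-up check-ins: (user_id, timestamp, latitude, longitude).
checkins = [
    (1, datetime(2010, 10, 17, 1, 48), 39.75, -104.99),
    (2, datetime(2010, 10, 17, 9, 5), 39.95, -104.98),
    (1, datetime(2010, 10, 18, 12, 0), 37.77, -122.42),
]


def bin_key(timestamp, lat, lon, cell_degrees, time_format):
    """Map a check-in to a (time bucket, grid row, grid column) key."""
    row = math.floor(lat / cell_degrees)
    col = math.floor(lon / cell_degrees)
    return (timestamp.strftime(time_format), row, col)


def count_bins(checkins, cell_degrees, time_format):
    """Count check-ins per space-time bin at one resolution."""
    return Counter(
        bin_key(t, lat, lon, cell_degrees, time_format)
        for _, t, lat, lon in checkins
    )


# Coarse view: 10-degree cells per month. Fine view: 0.1-degree cells per day.
coarse = count_bins(checkins, 10.0, "%Y-%m")
fine = count_bins(checkins, 0.1, "%Y-%m-%d")
```

With this sample, the coarse view merges the first two check-ins into one bin while the fine view keeps all three apart.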

ACLED (Armed Conflict Location and Event Data Project) (35MB)


This data set contains entries for each violent event in Africa from 1997 to 2014. This data set would be a good candidate for visualization with a linked timeline and choropleth map, where selections in the timeline can drive the filtering of data shown on the map.

  • timestamp: T
  • (latitude, longitude): G
  • country: G
  • number of fatalities: Q
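
The linked-view idea can be sketched as a filter followed by an aggregation: the timeline selection supplies a date range, and the map colors each country by the fatalities summed within that range. The events and country names below are placeholders, not real ACLED records, and `fatalities_by_country` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import date

# Placeholder events: (date, country, number of fatalities).
events = [
    (date(1999, 5, 1), "Country A", 3),
    (date(2005, 8, 12), "Country A", 10),
    (date(2005, 9, 2), "Country B", 4),
    (date(2013, 1, 20), "Country B", 7),
]


def fatalities_by_country(events, start, end):
    """Sum fatalities per country for events with start <= date <= end."""
    totals = defaultdict(int)
    for day, country, fatalities in events:
        if start <= day <= end:
            totals[country] += fatalities
    return dict(totals)


# A timeline selection covering 2005 drives the values shown on the map.
selection = fatalities_by_country(events, date(2005, 1, 1), date(2005, 12, 31))
```

Re-running the function whenever the selection changes keeps the map in sync with the timeline.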

Safecast (3.2GB)


Grassroots sensor data about nuclear radiation in Japan


Statistical Computing Statistical Graphics Data expo Airline on-time performance (12GB)


A great data set for scalability testing. This is the data set used in the Crossfilter Demo.


The GDELT Data Set (~100GB)


This would be a great data set for more extreme scalability testing. There is an Open Source project for loading this data set into Spark on AWS.


The Indian Census has lots of public data.


Best Buy has a developer portal for querying their data via a Web API.


History

Information Version
Fri, Jan 15, 2016 15:34 ggrefens from 129.175.15.11 4
Fri, Jan 15, 2016 15:33 ggrefens from 129.175.15.11 3
Fri, Jan 15, 2016 15:31 ggrefens from 129.175.15.11 2
Fri, Jan 15, 2016 15:28 ggrefens from 129.175.15.11 1