:art:

+3
+3
Open Data.md
···
51 51 - Support for integrating non-dataset files. A dataset could be linked to code, visualizations, pipelines, models, reports, ...
52 52 - **Reproducible and Verifiable**. People should be able to trust the final datasets without having to recompute everything from scratch. In "reality", events are immutable, data should be too. [Make datasets the center of the tooling](https://dagster.io/blog/software-defined-assets).
53 53 - With immutability and content addressing, you can move backwards in time and run transformations or queries on how the dataset was at a certain point in time.
54 + - [Datasets are books, not houses]()!
54 55 - **Permissionless**. Anyone should be able to add/update/fix datasets or their metadata. GitHub style collaboration, curation, and composability. On data.
55 56 - **Aligned Incentives**. Curators should have incentives to improve datasets. Data is messy after all, but a good set of incentives could make great datasets surface and reward contributors accordingly (e.g: [number of contributors to Dune](https://github.com/duneanalytics/spellbook/commits/main)).
56 57 - [Bounties](https://www.dolthub.com/bounties) could be created to reward people who add useful but missing datasets.
···
83 84 - To avoid yet another open dataset portal, build adapters to integrate with other indexes.
84 85 - For example, integrate all [Hugging Face datasets](https://huggingface.co/docs/datasets/index) by making a scheduled job that builds a Frictionless Catalog (bunch of `datapackage.yml`s pointing to their parquet files).
85 86 - [Expose a JSON-LD so Google Dataset Search can index it](https://developers.google.com/search/docs/appearance/structured-data/dataset).
87 + - [FAIR](https://www.go-fair.org/fair-principles/).
86 88 - **Formatting**. Datasets are saved and exposed in multiple formats (CSV, Parquet, ...). Could be done in the backend, or in the client when pulling data (WASM). The package manager should be **format and storage agnostic**. Give me the dataset with id `xyz` as a CSV in this folder.
87 89 - **Social**. Allow users, organizations, stars, citations, attaching default visualizations (d3, [Vega](https://vega.github.io/), [Vegafusion](https://github.com/vegafusion/vegafusion/), and others), ...
88 90 - Importing datasets. Making it possible to `data fork user/data`, improve something and publish the resulting dataset back (via something like a PR).
···
256 258 - [Carbon Plan](https://github.com/carbonplan)
257 259 - [Data is Plural](https://github.com/data-is-plural)
258 260 - [Data Liberation Project](https://github.com/data-liberation-project)
261 + - [Opendatasoft](https://www.opendatasoft.com/)
259 262
260 263 ### Indexes
261 264
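
As a note on the "immutability and content addressing" bullet in the first hunk: the idea of moving backwards in time can be illustrated with a small Python sketch. Everything here is an assumption made for illustration, not something the notes specify: the `ContentStore` class, its method names, the choice of SHA-256, and the in-memory dictionaries are all hypothetical. The point is only that when a dataset version is addressed by the hash of its bytes, publishing a new version never overwrites an old one.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, illustration only)."""

    def __init__(self):
        self._blobs = {}    # digest -> bytes; entries are never overwritten
        self._history = {}  # dataset name -> list of digests, oldest first

    def put(self, name: str, data: bytes) -> str:
        """Store one version of a dataset and return its content address."""
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        self._history.setdefault(name, []).append(digest)
        return digest

    def get(self, digest: str) -> bytes:
        """Fetch bytes by address, re-hashing them to verify integrity."""
        data = self._blobs[digest]
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored bytes do not match their address")
        return data

    def versions(self, name: str) -> list:
        """Return every address ever published under this name, oldest first."""
        return list(self._history.get(name, []))


store = ContentStore()
first = b"id,value\n1,10\n"
second = b"id,value\n1,10\n2,20\n"

v1 = store.put("example", first)
v2 = store.put("example", second)

# The older version stays retrievable by its address after the update.
old_snapshot = store.get(v1)
history = store.versions("example")
```

Because the address is derived from the content, a reader who holds `v1` can verify the bytes without trusting the store, which is one way to read the "trust without recomputing everything" goal in the same bullet.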