html_url,id,node_id,tag_name,target_commitish,name,draft,author,prerelease,created_at,published_at,assets,body,repo
https://github.com/simonw/csvs-to-sqlite/releases/tag/0.3,8556054,MDc6UmVsZWFzZTg1NTYwNTQ=,0.3,master,csvs-to-sqlite 0.3,0,9599,0,2017-11-17T05:26:07Z,2017-11-17T05:33:39Z,[],"- **Mechanism for converting columns into separate tables**
Let's say you have a CSV file that looks like this:
county,precinct,office,district,party,candidate,votes
Clark,1,President,,REP,John R. Kasich,5
Clark,2,President,,REP,John R. Kasich,0
Clark,3,President,,REP,John R. Kasich,7
(Real example from https://github.com/openelections/openelections-data-sd/blob/master/2016/20160607__sd__primary__clark__precinct.csv )
You can now convert selected columns into separate lookup tables using the new
--extract-column option (shortname: -c) - for example:
csvs-to-sqlite openelections-data-*/*.csv \
-c county:County:name \
-c precinct:Precinct:name \
-c office -c district -c party -c candidate \
openelections.db
The format is as follows:
column_name:optional_table_name:optional_table_value_column_name
If you just specify the column name, e.g. `-c party`, the following table will
be created:
CREATE TABLE ""party"" (
""id"" INTEGER PRIMARY KEY,
""value"" TEXT
);
If you specify all three parts, e.g. `-c precinct:Precinct:name`, the table
will look like this:
CREATE TABLE ""Precinct"" (
""id"" INTEGER PRIMARY KEY,
""name"" TEXT
);
The original tables will be created like this:
CREATE TABLE ""ca__primary__san_francisco__precinct"" (
""county"" INTEGER,
""precinct"" INTEGER,
""office"" INTEGER,
""district"" INTEGER,
""party"" INTEGER,
""candidate"" INTEGER,
""votes"" INTEGER,
FOREIGN KEY (county) REFERENCES County(id),
FOREIGN KEY (party) REFERENCES party(id),
FOREIGN KEY (precinct) REFERENCES Precinct(id),
FOREIGN KEY (office) REFERENCES office(id),
FOREIGN KEY (candidate) REFERENCES candidate(id)
);
They will be populated with IDs that reference the new derived tables.
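As a rough illustration of the idea, here is a minimal Python sketch using only the standard library `sqlite3` module. The `extract_column` helper, the `results` table and the sample rows are assumptions made up for this example and are not the tool's actual implementation. The sketch also updates the column in place, whereas the schema above suggests the real tool writes the table with INTEGER columns.

```python
import sqlite3


def extract_column(conn, table, column, lookup_table=None, value_column='value'):
    # Hypothetical helper, not part of csvs-to-sqlite: copies the distinct
    # values of one column into a lookup table, then replaces each value in
    # the original table with the matching lookup ID.
    lookup_table = lookup_table or column
    conn.execute(
        'CREATE TABLE [{}] (id INTEGER PRIMARY KEY, [{}] TEXT)'.format(
            lookup_table, value_column
        )
    )
    conn.execute(
        'INSERT INTO [{lt}] ([{vc}]) SELECT DISTINCT [{c}] FROM [{t}] '
        'WHERE [{c}] IS NOT NULL ORDER BY [{c}]'.format(
            lt=lookup_table, vc=value_column, c=column, t=table
        )
    )
    conn.execute(
        'UPDATE [{t}] SET [{c}] = (SELECT id FROM [{lt}] '
        'WHERE [{lt}].[{vc}] = [{t}].[{c}])'.format(
            lt=lookup_table, vc=value_column, c=column, t=table
        )
    )


conn = sqlite3.connect(':memory:')
# Columns are left untyped so the in-place update can store integer IDs.
conn.execute('CREATE TABLE results (county, precinct, votes INTEGER)')
conn.executemany(
    'INSERT INTO results VALUES (?, ?, ?)',
    [('Clark', '1', 5), ('Clark', '2', 0), ('Clark', '3', 7)],
)

# Equivalent in spirit to: -c county:County:name
extract_column(conn, 'results', 'county', 'County', 'name')

rows = conn.execute('SELECT county, precinct, votes FROM results').fetchall()
lookup = conn.execute('SELECT id, name FROM County').fetchall()
```

After this runs, `lookup` holds a single row for Clark and every row in `results` stores that row's ID in its `county` column.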
Closes #2
",110509816
https://github.com/simonw/csvs-to-sqlite/releases/tag/0.5,8575785,MDc6UmVsZWFzZTg1NzU3ODU=,0.5,master,csvs-to-sqlite 0.5,0,9599,0,2017-11-19T05:11:27Z,2017-11-19T05:53:25Z,[],"## Now correctly handles columns containing integers and nulls
Pandas does a good job of figuring out which SQLite column types should be
used for a DataFrame - with one exception: due to a limitation of NumPy it
treats columns containing a mixture of integers and NaN (blank values) as
being of type float64, which means they end up as REAL columns in SQLite.
http://pandas.pydata.org/pandas-docs/stable/gotchas.html#support-for-integer-na
To fix this, we now check to see if a float64 column actually consists solely
of NaN and integer-valued floats (checked using v.is_integer() in Python). If
that is the case, we override the column type to be INTEGER instead.
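
The check described above can be sketched in plain Python roughly as follows. The function names `is_integer_column` and `sqlite_type_for` are assumptions chosen for this example, and the sketch works on a plain list of floats rather than a DataFrame column.

```python
import math


def is_integer_column(values):
    # True when every value is either NaN (a blank) or a float with no
    # fractional part, as reported by float.is_integer().
    return all(math.isnan(v) or v.is_integer() for v in values)


def sqlite_type_for(values):
    # Hypothetical helper: picks the SQLite column type for a float64 column.
    return 'INTEGER' if is_integer_column(values) else 'REAL'


with_blanks = [1.0, float('nan'), 3.0]
with_fractions = [1.5, float('nan'), 3.0]

integer_type = sqlite_type_for(with_blanks)
real_type = sqlite_type_for(with_fractions)
```

Here `with_blanks` would be stored as INTEGER, while `with_fractions` stays REAL because 1.5 has a fractional part.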
See #5 - also a8ab524 and 0997b7b",110509816
https://github.com/simonw/csvs-to-sqlite/releases/tag/0.6,8651869,MDc6UmVsZWFzZTg2NTE4Njk=,0.6,master,csvs-to-sqlite 0.6,0,9599,0,2017-11-24T23:12:10Z,2017-11-24T23:16:45Z,[],"## SQLite full-text search support
- Added `--fts` option for setting up SQLite full-text search.
The `--fts` option will create a corresponding SQLite FTS virtual table, using
the best available version of the FTS module.
https://sqlite.org/fts5.html
https://www.sqlite.org/fts3.html
Usage:
csvs-to-sqlite my-csv.csv output.db -f column1 -f column2
Example generated with this option: https://sf-trees-search.now.sh/
Example search: https://sf-trees-search.now.sh/sf-trees-search-a899b92?sql=select+*+from+Street_Tree_List+where+rowid+in+%28select+rowid+from+Street_Tree_List_fts+where+Street_Tree_List_fts+match+%27grove+london+dpw%27%29%0D%0A
Will be used in https://github.com/simonw/datasette/issues/131
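
For readers curious how this might work, here is a minimal sketch using the standard library `sqlite3` module. The helper names `best_fts_version` and `enable_fts`, the `trees` table and its two sample rows are assumptions invented for this example, not the tool's real code or data. It assumes the local SQLite build includes at least one FTS module.

```python
import sqlite3


def best_fts_version():
    # Probe a throwaway in-memory database, newest FTS module first.
    probe = sqlite3.connect(':memory:')
    for version in ('FTS5', 'FTS4', 'FTS3'):
        try:
            probe.execute(
                'CREATE VIRTUAL TABLE probe USING {} (t)'.format(version)
            )
            return version
        except sqlite3.OperationalError:
            continue
    return None


def enable_fts(conn, table, columns):
    # Hypothetical helper: builds [table]_fts and copies the chosen columns
    # into it, keeping rowids aligned with the source table.
    version = best_fts_version()
    if version is None:
        raise RuntimeError('this SQLite build has no FTS module')
    cols = ', '.join('[{}]'.format(c) for c in columns)
    conn.execute(
        'CREATE VIRTUAL TABLE [{}_fts] USING {} ({})'.format(table, version, cols)
    )
    conn.execute(
        'INSERT INTO [{t}_fts] (rowid, {cols}) '
        'SELECT rowid, {cols} FROM [{t}]'.format(t=table, cols=cols)
    )
    return version


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE trees (qSpecies TEXT, qAddress TEXT)')
conn.executemany(
    'INSERT INTO trees VALUES (?, ?)',
    [('London Plane', '100 Grove St'), ('Coast Live Oak', '20 Market St')],
)
used_version = enable_fts(conn, 'trees', ['qSpecies', 'qAddress'])

# Same query shape as the example search above.
matches = conn.execute(
    'SELECT qSpecies FROM trees WHERE rowid IN '
    '(SELECT rowid FROM trees_fts WHERE trees_fts MATCH ?)',
    ('grove london',),
).fetchall()
```

The search terms are matched across both indexed columns, so only the row that contains both words is returned.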
- `--fts` and `--extract-column` now cooperate.
If you extract a column and then specify that same column in the `--fts` list,
`csvs-to-sqlite` now uses the original value of that column in the index.
Example using CSV from https://data.sfgov.org/City-Infrastructure/Street-Tree-List/tkzw-k3nq
csvs-to-sqlite Street_Tree_List.csv trees-fts.db \
-c qLegalStatus -c qSpecies -c qSiteInfo \
-c PlantType -c qCaretaker -c qCareAssistant \
-f qLegalStatus -f qSpecies -f qAddress \
-f qSiteInfo -f PlantType -f qCaretaker \
-f qCareAssistant -f PermitNotes
Closes #9
- Handle column names with spaces in them.
- Added `csvs-to-sqlite --version` option.
Using http://click.pocoo.org/5/api/#click.version_option",110509816
https://github.com/simonw/csvs-to-sqlite/releases/tag/0.6.1,8652417,MDc6UmVsZWFzZTg2NTI0MTc=,0.6.1,master,csvs-to-sqlite 0.6.1,0,9599,0,2017-11-25T02:57:01Z,2017-11-25T02:58:25Z,[],"- `-f` and `-c` now work for multiple columns in a single table.
Fixes #12
",110509816
https://github.com/simonw/csvs-to-sqlite/releases/tag/0.7,8656486,MDc6UmVsZWFzZTg2NTY0ODY=,0.7,master,csvs-to-sqlite 0.7,0,9599,0,2017-11-26T03:11:33Z,2017-11-26T03:14:11Z,[],- Add -s option to specify input field separator (#13) [Jani Monoses],110509816
https://github.com/simonw/csvs-to-sqlite/releases/tag/0.8,10696701,MDc6UmVsZWFzZTEwNjk2NzAx,0.8,master,csvs-to-sqlite 0.8,0,9599,0,2018-04-24T15:08:37Z,2018-04-24T15:35:30Z,[],"- `-d` and `-df` options for specifying date/datetime columns, closes #33
- Maintain lookup tables in SQLite, refs #17
- `--index` option to specify which columns to index, closes #24
- Test confirming `--shape` and `--filename-column` and `-c` work together #25
- Use usecols when loading CSV if shape specified
- `--filename-column` is now compatible with `--shape`, closes #10
- `--no-index-fks` option
By default, csvs-to-sqlite creates an index for every foreign key column that is
added using the `--extract-column` option.
For large tables, this can dramatically increase the size of the resulting
database file on disk. The new `--no-index-fks` option allows you to disable
this feature to save on file size.
Refs #24 which will allow you to explicitly list which columns SHOULD have
an index created.
- Added `--filename-column` option, refs #10
- Fixes for Python 2, refs #25
- Implemented new `--shape` option - refs #25
- `--table` option for specifying table to write to, refs #10
- Updated README to cover `--skip-errors`, refs #20
- Add `--skip-errors` option (#20) [Jani Monoses]
- Less verbosity (#19) [Jani Monoses]
Only log `extract_columns` info when that option is passed.
- Add option for field quoting behaviour (#15) [Jani Monoses]",110509816
https://github.com/simonw/csvs-to-sqlite/releases/tag/0.9,15022807,MDc6UmVsZWFzZTE1MDIyODA3,0.9,master,csvs-to-sqlite 0.9,0,9599,0,2019-01-17T05:17:02Z,2019-01-17T05:20:23Z,[],"- Support for loading CSVs directly from URLs, thanks @betatim - #38
- New -pk/--primary-key options, closes #22
- Create FTS index for extracted column values
- Added --no-fulltext-fks option, closes #32
- Now using black for code formatting
- Bumped versions of dependencies",110509816
https://github.com/simonw/csvs-to-sqlite/releases/tag/0.9.1,18185234,MDc6UmVsZWFzZTE4MTg1MjM0,0.9.1,master,csvs-to-sqlite 0.9.1,0,9599,0,2019-06-24T15:16:54Z,2019-06-24T15:21:12Z,[],* Fixed bug where `-f` option used FTS4 even when FTS5 was available (#41),110509816
https://github.com/simonw/csvs-to-sqlite/releases/tag/0.9.2,18377238,MDc6UmVsZWFzZTE4Mzc3MjM4,0.9.2,master,csvs-to-sqlite 0.9.2,0,9599,0,2019-07-03T04:36:26Z,2019-07-03T04:37:15Z,[],Bumped dependencies and pinned pytest to version 4 (5 is incompatible with Python 2.7).,110509816
https://github.com/simonw/csvs-to-sqlite/releases/tag/1.0,19056866,MDc6UmVsZWFzZTE5MDU2ODY2,1.0,master,csvs-to-sqlite 1.0,0,9599,0,2019-08-03T10:50:48Z,2019-08-03T10:58:15Z,[],This release drops support for Python 2.x #55,110509816