Changelog 2022.3.10.dev1
Highlights
- dataset_type and dataset_hashpart now produce one dataset for each source dataset.
- New datetime parser, no longer uses strptime from libc. This adds several new features and will probably not break existing uses.
  - No longer allows spaces around each value, you can put spaces in your pattern if needed. Fixed length numbers are still allowed to use spaces as leading zeros though. This is the most breaking change.
  - Wildcards for various character classes:
    - % whitespace (exactly one, plain space is any number)
    - %. any character except whitespace
    - %* any character including whitespace
    - %# any digit
    - %@ any non-digit character, excluding whitespace
    - %^ any non-digit character, including whitespace
    - These also accept a COUNT or LOW,HIGH range, e.g. %2,5. matches 2 to 5 non-whitespace characters
  - %f also accepts a COUNT, %6f is the same as the default (microseconds), %3f is milliseconds etc.
  - Optional elements:
    - %? makes the next element (or COUNT or LOW,HIGH elements) optional. If less than LOW tokens match, no part of the string is consumed, no updates occur to the parsed value and matching continues after HIGH elements in the format string.
    - %: is ELSE (like : in ?: in C), only executes the following group if the previous %? did not match.
    - %- ignores the next element value. It must still parse as the specified type, but the parsed value is discarded.
    - %/ makes the rest of the pattern optional. Matching continues until it fails. Unconsumed input is still an error unless you have chosen to ignore trailing text using an i-suffixed type.
  - New format code %e for “Excel dates”. Three possible variants:
    - %0e LibreOffice dates, epoch is 1899-12-30
    - %1e Lotus 1-2-3 dates, epoch is 1899-12-31 and 1900 is a leap year. (This is the default, since that is what Excel uses.)
    - %2e Excel Mac dates, epoch is 1904-01-01.
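The three “Excel date” epochs can be illustrated with a small standalone sketch. This is not the Accelerator's implementation and does not call its parser. The function name excel_serial_to_datetime and the variant argument are hypothetical, chosen to mirror %0e, %1e and %2e. How the real parser treats serial 60 (the non-existent 1900-02-29) is not stated in these notes, so rejecting it here is an assumption.

```python
from datetime import datetime, timedelta

# Epochs as listed in the changelog; keys mirror %0e, %1e and %2e.
EPOCHS = {
    0: datetime(1899, 12, 30),  # LibreOffice
    1: datetime(1899, 12, 31),  # Lotus 1-2-3 / Excel (1900 treated as a leap year)
    2: datetime(1904, 1, 1),    # Excel Mac
}

def excel_serial_to_datetime(serial, variant=1):
    """Convert a spreadsheet serial day number to a datetime (illustrative only)."""
    if variant not in EPOCHS:
        raise ValueError("unknown variant: %r" % (variant,))
    if variant == 1:
        if 60 <= serial < 61:
            # Serial 60 would be 1900-02-29, which does not exist.
            # Rejecting it is this sketch's choice, not documented behaviour.
            raise ValueError("serial 60 is the non-existent 1900-02-29")
        if serial >= 61:
            # After the phantom leap day the count lines up with the 1899-12-30 epoch.
            return EPOCHS[0] + timedelta(days=serial)
    return EPOCHS[variant] + timedelta(days=serial)

if __name__ == "__main__":
    for variant in (0, 1, 2):
        print(variant, excel_serial_to_datetime(61, variant))
```

From 1900-03-01 onwards the %0e and %1e interpretations agree. They only differ for serial numbers below 60, where the Lotus 1-2-3 count is one day ahead.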
In your methods
- job.load() and job.load_json() now take a default= argument specifying an object you get back (instead of an exception) if the file does not exist.
- job.save() and friends never complain about missing temp= arguments, these silently default to False everywhere now.
- Jobs now save what input_directory they ran with.
  - Available as job.input_directory and also in ax job.
  - Changing input_directory still does not cause any jobs to be rebuilt.
- Report the line reached when an exception occurs while iterating a dataset.
- Datasets no longer support the bits32 and bits64 types.
- Datasets have a .chain_within_job() method. (Like .chain() but stopping as soon as the previous dataset is in a different job.)
- The new datetime parser is exposed as accelerator.dsutil.strptime() and accelerator.dsutil.strptime_i().
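The difference between .chain() and .chain_within_job() can be pictured with a toy model. The ToyDataset class and both functions below are hypothetical stand-ins written for this illustration. They do not use the Accelerator API, and the oldest-first ordering of the result is an assumption.

```python
class ToyDataset:
    """Hypothetical stand-in: a name, the job that produced it, and a previous link."""
    def __init__(self, name, job, previous=None):
        self.name = name
        self.job = job
        self.previous = previous

def chain(ds):
    """Follow previous links all the way back; oldest dataset first."""
    result = []
    while ds is not None:
        result.append(ds)
        ds = ds.previous
    result.reverse()
    return result

def chain_within_job(ds):
    """Like chain(), but stop as soon as the previous dataset is in a different job."""
    result = [ds]
    while ds.previous is not None and ds.previous.job == ds.job:
        ds = ds.previous
        result.append(ds)
    result.reverse()
    return result

# One dataset from an import job, followed by two datasets from one typing job.
a = ToyDataset("a", job="import-0")
b = ToyDataset("b", job="type-1", previous=a)
c = ToyDataset("c", job="type-1", previous=b)

full = [d.name for d in chain(c)]
within = [d.name for d in chain_within_job(c)]
print(full, within)
```

In this model the full chain contains all three datasets, while the within-job chain contains only the two produced by the same job as the starting dataset.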
Standard methods
- dataset_type and dataset_hashpart now produce one dataset for each source dataset. This removes a lot of corner cases with datasets that differ across the chain.
- The as_chain argument of dataset_type and dataset_hashpart is now chain_slices, to better match what it means.
- More changes in dataset_type:
  - Uses the new datetime parser.
  - Remove the floatint types, they were too specialised for general usefulness.
  - All columns can be renamed, not just typed ones.
  - Typing errors now show the source ds, column name and sliceno instead of the much less useful filename.
- New method dataset_concat which can be used after dataset_type or dataset_hashpart to get a result more similar to the previous behaviour (but without support for varying columns).
- Remove the dataset_unbits method again. (Only in the 2022.8.4.dev1 release.)
Shell
- ax ds --location (-w) shows where a column originated. Both parent dataset and if applicable where it was typed from.
- ax job can show files produced by a job with --file (-f).
- Three new options for ax grep:
  - --fixed-strings (-F), like on standard grep.
  - --max-count (-m), like on standard grep.
  - --numeric (-N) to match (ranges of) numeric values instead of text.
And finally
- More tests.
- Various bug fixes / minor improvements of course.
Note: Python 2 is still supported, but will be unsupported soon.
Additional Resources
The Accelerator’s Homepage (exax.org)
The Accelerator on Github/exaxorg
The Accelerator on PyPI
Reference Manual