Persistence ¶
The persistence subpackage handles data storage and retrieval, mainly to support backtesting.
- class BaseDataCatalog(*args, **kw) ¶
  Bases: ABC
  Provides an abstract base class for a queryable data catalog.
- class ParquetDataCatalog(*args, **kw) ¶
  Bases: BaseDataCatalog
  Provides a queryable data catalog persisted to files in Parquet format.
  - Parameters:
    - path (str) – The root path for this data catalog. Must exist and must be an absolute path.
    - fs_protocol (str, default 'file') – The fsspec filesystem protocol to use.
    - fs_storage_options (dict, optional) – The fs storage options.
  Warning
  The catalog is not thread-safe.
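The path constraints (must exist, must be absolute) can be satisfied with the standard library before constructing the catalog; a minimal sketch using only `os`/`tempfile` (the catalog construction itself is omitted, since it requires an installed nautilus_trader):

```python
import os
import tempfile

# The catalog root must be an existing absolute path (per the parameter docs).
root = os.path.join(tempfile.gettempdir(), "catalog")
os.makedirs(root, exist_ok=True)  # ensure the directory exists
path = os.path.abspath(root)      # ensure the path is absolute
```

The resulting `path` would then be passed as the first argument to `ParquetDataCatalog`.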
- class BarDataWrangler(BarType bar_type, Instrument instrument) ¶
  Bases: object
  Provides a means of building lists of Nautilus Bar objects.
  - Parameters:
    - bar_type (BarType) – The bar type for the wrangler.
    - instrument (Instrument) – The instrument for the wrangler.
- process(self, data: pd.DataFrame, default_volume: float = 1000000.0, ts_init_delta: int = 0) ¶
  Process the given bar dataset into Nautilus Bar objects.
  Expects columns ['open', 'high', 'low', 'close', 'volume'] with a 'timestamp' index. Note: the 'volume' column is optional; if it is absent, default_volume is used.
  - Parameters:
    - data (pd.DataFrame) – The data to process.
    - default_volume (float) – The default volume for each bar (if not provided).
    - ts_init_delta (int) – The difference in nanoseconds between the data timestamps and the ts_init value. Can be used to represent/simulate latency between the data source and the Nautilus system.
  - Returns: list[Bar]
  - Raises: ValueError – If data is empty.
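The expected input shape can be sketched with plain pandas (values are illustrative; the wrangler call itself is omitted since it requires an installed nautilus_trader):

```python
import pandas as pd

# Build a DataFrame in the shape process() expects: columns
# ['open', 'high', 'low', 'close', 'volume'] with a 'timestamp' index.
index = pd.date_range("2024-01-01", periods=3, freq="1min", name="timestamp")
bars = pd.DataFrame(
    {
        "open":   [1.1000, 1.1005, 1.1010],
        "high":   [1.1010, 1.1015, 1.1020],
        "low":    [1.0995, 1.1000, 1.1005],
        "close":  [1.1005, 1.1010, 1.1015],
        "volume": [1500.0, 1800.0, 1200.0],  # optional; default_volume used if absent
    },
    index=index,
)
```

A frame like `bars` would then be passed as the `data` argument to `process()`.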
- class QuoteTickDataWrangler(Instrument instrument) ¶
  Bases: object
  Provides a means of building lists of Nautilus QuoteTick objects.
  - Parameters:
    - instrument (Instrument) – The instrument for the data wrangler.
- process(self, data: pd.DataFrame, default_volume: float = 1000000.0, ts_init_delta: int = 0) ¶
  Process the given tick dataset into Nautilus QuoteTick objects.
  Expects columns ['bid_price', 'ask_price'] with a 'timestamp' index. Note: the 'bid_size' and 'ask_size' columns are optional; if they are absent, default_volume is used.
  - Parameters:
    - data (pd.DataFrame) – The tick data to process.
    - default_volume (float) – The default volume for each tick (if not provided).
    - ts_init_delta (int) – The difference in nanoseconds between the data timestamps and the ts_init value. Can be used to represent/simulate latency between the data source and the Nautilus system. Cannot be negative.
  - Returns: list[QuoteTick]
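As with the bar wrangler, the expected input shape can be sketched with plain pandas (illustrative values; the wrangler call itself is omitted):

```python
import pandas as pd

# Build a DataFrame in the shape process() expects: columns
# ['bid_price', 'ask_price'] with a 'timestamp' index. The 'bid_size' and
# 'ask_size' columns are optional and fall back to default_volume.
index = pd.date_range("2024-01-01", periods=2, freq="1s", name="timestamp")
quotes = pd.DataFrame(
    {"bid_price": [1.1000, 1.1001], "ask_price": [1.1002, 1.1003]},
    index=index,
)
```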
- process_bar_data(self, bid_data: pd.DataFrame, ask_data: pd.DataFrame, default_volume: float = 1000000.0, ts_init_delta: int = 0, random_seed: Optional[int] = None, is_raw: bool = False) ¶
  Process the given bar datasets into Nautilus QuoteTick objects.
  Expects columns ['open', 'high', 'low', 'close', 'volume'] with a 'timestamp' index. Note: the 'volume' column is optional; if it is absent, default_volume is used.
  - Parameters:
    - bid_data (pd.DataFrame) – The bid bar data.
    - ask_data (pd.DataFrame) – The ask bar data.
    - default_volume (float) – The volume per tick if not available from the data.
    - ts_init_delta (int) – The difference in nanoseconds between the data timestamps and the ts_init value. Can be used to represent/simulate latency between the data source and the Nautilus system.
    - random_seed (int, optional) – The random seed for shuffling the order of high and low ticks from bar data. If random_seed is None, no shuffling is performed.
    - is_raw (bool, default False) – If the data is scaled to the Nautilus fixed precision.
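Since ts_init_delta is expressed in nanoseconds, modeling a feed latency is simple unit arithmetic (the 50 ms figure below is illustrative, not a recommendation):

```python
# ts_init_delta is in nanoseconds. To model, say, 50 ms of latency between
# the data source and the Nautilus system:
latency_ms = 50                          # illustrative latency
ts_init_delta = latency_ms * 1_000_000   # milliseconds -> nanoseconds
```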
- class TradeTickDataWrangler(Instrument instrument) ¶
  Bases: object
  Provides a means of building lists of Nautilus TradeTick objects.
  - Parameters:
    - instrument (Instrument) – The instrument for the data wrangler.
- process(self, data: pd.DataFrame, ts_init_delta: int = 0, is_raw: bool = False) ¶
  Process the given trade tick dataset into Nautilus TradeTick objects.
  - Parameters:
    - data (pd.DataFrame) – The data to process.
    - ts_init_delta (int) – The difference in nanoseconds between the data timestamps and the ts_init value. Can be used to represent/simulate latency between the data source and the Nautilus system.
    - is_raw (bool, default False) – If the data is scaled to the Nautilus fixed precision.
  - Raises: ValueError – If data is empty.
- list_from_capsule(capsule) → list[Data] ¶
- class LinePreprocessor ¶
  Bases: object
  Provides pre-processing of lines before they are passed to a Reader class (currently only TextReader).
  Used if the input data requires any pre-processing that may also be required as attributes on the resulting Nautilus objects that are created.
Examples
For example, if you were logging data in Python with a prepended timestamp, as below:
2021-06-29T06:03:14.528000 - {"op":"mcm","pt":1624946594395,"mc":[{"id":"1.179082386","rc":[{"atb":[[1.93,0]]}]}
The raw JSON data follows the logging timestamp, and we would also want to use this timestamp as the Nautilus ts_init value. In this instance, you could use something like:
>>> class LoggingLinePreprocessor(LinePreprocessor):
>>>     @staticmethod
>>>     def pre_process(line):
>>>         timestamp, json_data = line.split(' - ')
>>>         yield json_data, {'ts_init': pd.Timestamp(timestamp)}
>>>
>>>     @staticmethod
>>>     def post_process(obj: Any, state: dict):
>>>         obj.ts_init = state['ts_init']
>>>         return obj
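The split inside pre_process above can be exercised standalone with pandas (sample line abbreviated from the example):

```python
import pandas as pd

# The same ' - ' split used by pre_process in the example above.
line = '2021-06-29T06:03:14.528000 - {"op":"mcm","pt":1624946594395}'
timestamp, json_data = line.split(" - ")
state = {"ts_init": pd.Timestamp(timestamp)}
```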
- class Reader(instrument_provider: Optional[InstrumentProvider] = None, instrument_provider_update: Optional[Callable] = None) ¶
  Bases: object
  Provides parsing of raw byte blocks to Nautilus objects.
- class ByteReader(block_parser: Callable, instrument_provider: Optional[InstrumentProvider] = None, instrument_provider_update: Optional[Callable] = None) ¶
  Bases: Reader
  A Reader subclass for reading blocks of raw bytes; block_parser will be passed blocks of raw bytes.
  - Parameters:
    - block_parser (Callable) – The handler which takes a block of bytes and yields Nautilus objects.
    - instrument_provider (InstrumentProvider, optional) – The instrument provider for the reader.
    - instrument_provider_update (Callable, optional) – An optional hook/callable to update the instrument provider before data is passed to block_parser (in many cases instruments need to be known ahead of parsing).
- class TextReader(line_parser: Callable, line_preprocessor: Optional[LinePreprocessor] = None, instrument_provider: Optional[InstrumentProvider] = None, instrument_provider_update: Optional[Callable] = None, newline: bytes = b'\n') ¶
  Bases: ByteReader
  A Reader subclass for reading lines of a text-like file; line_parser will be passed a single row of bytes.
  - Parameters:
    - line_parser (Callable) – The handler which takes byte strings and yields Nautilus objects.
    - line_preprocessor (LinePreprocessor, optional) – The context manager for pre-processing (cleaning log lines) of lines before json.loads is called. Nautilus objects are returned to the context manager for any post-processing as well (for example, setting the ts_init).
    - instrument_provider (InstrumentProvider, optional) – The instrument provider for the reader.
    - instrument_provider_update (Callable, optional) – An optional hook/callable to update the instrument provider before data is passed to line_parser (in many cases instruments need to be known ahead of parsing).
    - newline (bytes) – The newline char value.
- class CSVReader(block_parser: Callable, instrument_provider: Optional[InstrumentProvider] = None, instrument_provider_update: Optional[Callable] = None, header: Optional[list[str]] = None, chunked: bool = True, as_dataframe: bool = True, separator: str = ',', newline: bytes = b'\n', encoding: str = 'utf-8') ¶
  Bases: Reader
  Provides parsing of CSV-formatted byte strings to Nautilus objects.
  - Parameters:
    - block_parser (Callable) – The handler which takes byte strings and yields Nautilus objects.
    - instrument_provider (InstrumentProvider, optional) – The reader's instrument provider.
    - instrument_provider_update – Optional hook to call before the parser, for the purpose of loading instruments into an InstrumentProvider.
    - header (list[str], default None) – If the first row contains the column names, header should be left as None. If the data starts right at the first row, header must be provided as the list of column names.
    - chunked (bool, default True) – If chunked=False, each CSV line will be passed to block_parser individually; if chunked=True, the data passed will potentially contain many lines (a block).
    - as_dataframe (bool, default True) – If as_dataframe=True, the passed block will be parsed into a DataFrame before being passed to block_parser.
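When as_dataframe=True, what reaches block_parser is roughly what pandas itself produces from the raw CSV bytes; an illustrative sketch using only pandas (the column names are hypothetical):

```python
import io
import pandas as pd

# Illustrative: what a CSV block looks like once parsed into a DataFrame.
# Here the first row contains the column names, so no header list is needed.
raw = b"timestamp,bid_price,ask_price\n2021-01-01 00:00:00,1.1000,1.1002\n"
block = pd.read_csv(io.BytesIO(raw))  # header inferred from the first row
```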
- class ParquetReader(parser: Optional[Callable] = None, instrument_provider: Optional[InstrumentProvider] = None, instrument_provider_update: Optional[Callable] = None) ¶
  Bases: ByteReader
  Provides parsing of Parquet-specification bytes to Nautilus objects.
  - Parameters:
    - parser (Callable) – The parser.
    - instrument_provider (InstrumentProvider, optional) – The reader's instrument provider.
    - instrument_provider_update (Callable, optional) – An optional hook/callable to update the instrument provider before data is passed to the parser (in many cases instruments need to be known ahead of parsing).
- class StreamingEngine(data_configs: list[nautilus_trader.config.backtest.BacktestDataConfig], target_batch_size_bytes: int = 512000000) ¶
  Bases: _BufferIterator
  Streams merged batches of Nautilus objects from BacktestDataConfig objects.
- extract_generic_data_client_ids(data_configs: list[nautilus_trader.config.backtest.BacktestDataConfig]) → dict ¶
  Extract a mapping of data_type : client_id from the list of data_configs. When merging the streaming data we lose the client_id for generic data, so it must be injected back in for the backtest engine to be loaded correctly.
- class StreamingFeatherWriter(path: str, logger: LoggerAdapter, fs_protocol: Optional[str] = 'file', flush_interval_ms: Optional[int] = None, replace: bool = False, include_types: Optional[tuple[type]] = None) ¶
  Bases: object
  Provides a stream writer of Nautilus objects into feather files.
  - Parameters:
    - path (str) – The path to persist the stream to.
    - logger (LoggerAdapter) – The logger for the writer.
    - fs_protocol (str, default 'file') – The fsspec file system protocol.
    - flush_interval_ms (int, optional) – The flush interval (milliseconds) for writing chunks.
    - replace (bool, default False) – If existing files at the given path should be replaced.
- write(obj: object) → None ¶
  Write the object to the stream.
  - Parameters:
    - obj (object) – The object to write.
  - Raises: ValueError – If obj is None.
- check_flush() → None ¶
  Flush all stream writers if the current time is greater than the next flush interval.
- flush() → None ¶
  Flush all stream writers.
- close() → None ¶
  Flush and close all stream writers.
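The flush_interval_ms behaviour can be sketched with a small hypothetical helper (not the library's implementation; class and attribute names here are illustrative):

```python
import time

class IntervalFlusher:
    """Hypothetical sketch of the check_flush/flush pattern described above."""

    def __init__(self, flush_interval_ms: int) -> None:
        self.flush_interval_s = flush_interval_ms / 1000.0
        self._last_flush = time.monotonic()
        self.flush_count = 0

    def check_flush(self) -> None:
        # Flush only if the interval has elapsed since the last flush.
        now = time.monotonic()
        if now - self._last_flush >= self.flush_interval_s:
            self.flush()
            self._last_flush = now

    def flush(self) -> None:
        self.flush_count += 1

writer = IntervalFlusher(flush_interval_ms=0)  # 0 ms => a flush is always due
writer.check_flush()
```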
- generate_signal_class(name: str, value_type: type) → type ¶
  Dynamically create a Data subclass for this signal.
  - Parameters:
    - name (str) – The name of the signal data.
    - value_type (type) – The type for the signal data value.
  - Returns: SignalData
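Dynamic class creation of this kind can be sketched with the built-in type() constructor; a simplified, self-contained sketch (make_signal_class and the generated class name are illustrative, and object stands in for the real Data base class):

```python
# Simplified sketch of dynamically creating a per-signal class, as
# generate_signal_class is described above. Not the library's implementation.
def make_signal_class(name: str, value_type: type) -> type:
    def __init__(self, value, ts_init: int = 0):
        self.value = value_type(value)  # coerce to the declared value type
        self.ts_init = ts_init

    # type(name, bases, namespace) builds a new class at runtime.
    return type(f"Signal{name.title()}", (object,), {"__init__": __init__})

AlphaSignal = make_signal_class("alpha", float)
sig = AlphaSignal("1.5", ts_init=100)
```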