API Reference

Contains documentation for the lob_reconstructor library, created for UCLA.

orderbook Module

class lobster_reconstructor.orderbook.Orderbook(nlevels: int, ticker: str, tick_size: float, price_scaling: float = 0.0001, use_matching_engine: bool = False)

Bases: object

Limit Order Book (LOB) data structure with support for order submission, cancellation, execution, OFI computation, and visualization.

Parameters:
  • nlevels (int) – Maximum number of price levels to store in the book.

  • ticker (str) – Ticker symbol for the asset.

  • tick_size (float) – Minimum tick size (price increment).

  • price_scaling (float, default=0.0001) – Scaling factor to convert integer price representation to display prices (e.g. 0.0001 for LOBSTER data).

  • use_matching_engine (bool, default=False) – Whether to use the built-in matching engine to compute the order book state. For exact reconstruction, set use_matching_engine=False. For clean snapshot visualization, set use_matching_engine=True.
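A minimal sketch of how price_scaling relates integer prices to display prices, assuming the conversion is a plain multiplication. The helper name to_display_price is hypothetical and is not part of the library.

```python
def to_display_price(int_price: int, price_scaling: float = 0.0001) -> float:
    """Convert an integer price to a display price (hypothetical helper)."""
    # Rounding to 4 decimals removes floating-point noise at the 0.0001 scale.
    return round(int_price * price_scaling, 4)


display = to_display_price(1_234_500)
print(display)  # 123.45
```

Keeping prices as integers inside the book avoids float comparison issues when prices are used as dictionary keys.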

bids

Bid side of the order book, keyed by descending price.

Type:

SortedDict

asks

Ask side of the order book, keyed by ascending price.

Type:

SortedDict

curr_book_timestamp

Current timestamp of the order book.

Type:

float

midprice

Current midprice of the book, if defined.

Type:

float or None

midprice_change_timestamp

Timestamp of the last midprice change.

Type:

float

cum_OFI

Object tracking cumulative order flow imbalance (OFI).

Type:

OFI

trade_log

List of executed trades (as namedtuples).

Type:

list

available_vol_at_price(price: int) → int

Get total volume available at a given price.

Parameters:

price (int) – Price level.

Returns:

Aggregate volume at the specified price.

Return type:

int

bid_ask_spread() → int

Compute the bid-ask spread.

Returns:

Difference between lowest ask and highest bid price.

Return type:

int

calc_count_OFI() → int

Compute count-based Order Flow Imbalance (OFI).

Returns:

Net OFI based on order counts.

Return type:

int

calc_size_OFI() → int

Compute size-based Order Flow Imbalance (OFI).

Returns:

Net OFI based on order sizes.

Return type:

int

clear_orderbook() → None

Reset the order book to an empty state.

clear_trade_log() → None

Clear the trade log without affecting the order book.

convert_orderbook_to_L2_dataframe() → DataFrame

Convert the current order book state into a DataFrame containing L2 data. Captures up to nlevels price levels (the value set when the order book is initialized).

Returns:

Pandas DataFrame with columns:

  • direction : {“bid”, “ask”}

  • price : int

  • size : int (aggregate volume at price level)

Return type:

DataFrame
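The L2 layout described above can be sketched as follows, assuming resting orders are available as (direction, price, size) tuples. The function aggregate_to_l2 is a hypothetical illustration that returns plain rows rather than a DataFrame; it is not the library's implementation.

```python
from collections import defaultdict


def aggregate_to_l2(orders, nlevels):
    """Aggregate per-order sizes into per-price rows (hypothetical helper)."""
    levels = {"bid": defaultdict(int), "ask": defaultdict(int)}
    for direction, price, size in orders:
        levels[direction][price] += size

    rows = []
    # Bids are listed from the highest price down, asks from the lowest price up.
    for price in sorted(levels["bid"], reverse=True)[:nlevels]:
        rows.append({"direction": "bid", "price": price, "size": levels["bid"][price]})
    for price in sorted(levels["ask"])[:nlevels]:
        rows.append({"direction": "ask", "price": price, "size": levels["ask"][price]})
    return rows


orders = [("bid", 1000, 100), ("bid", 1000, 50), ("bid", 999, 20), ("ask", 1002, 70)]
l2_rows = aggregate_to_l2(orders, nlevels=1)
```

With nlevels=1 only the best bid level and the best ask level survive, each carrying the summed size of its orders.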

convert_orderbook_to_L3_dataframe() → DataFrame

Convert the current order book state into a DataFrame containing L3 data. Captures up to nlevels price levels (the value set when the order book is initialized).

Returns:

Pandas DataFrame with columns:

  • direction : {“bid”, “ask”}

  • price : int

  • size : int (aggregate volume at price level)

Return type:

DataFrame

display_L2_order_book() → None

Display the L2 order book as a bar chart.

Uses Plotly to show aggregate volume at each price level.

Warns:

UserWarning – If the order book is empty or plotting fails.

display_L3_order_book() → None

Display the L3 order book as a bar chart.

Uses Plotly to show aggregate volume at each price level.

Warns:

UserWarning – If the order book is empty or plotting fails.

highest_bid_price() → int

Get the current highest bid price.

Returns:

Highest bid price, or 0 if no bids exist.

Return type:

int

highest_bid_volume() → int

Get the total volume at the best bid price.

Returns:

Aggregate size of orders at the highest bid.

Return type:

int

lowest_ask_price() → int

Get the current lowest ask price.

Returns:

Lowest ask price, or np.inf if no asks exist.

Return type:

int, or float (np.inf) if no asks exist

lowest_ask_volume() → int

Get the total volume at the best ask price.

Returns:

Aggregate size of orders at the lowest ask.

Return type:

int

meta_orders(time_delta=0) → List[List[namedtuple]]

Group trades into meta-orders based on time and type.

Parameters:

time_delta (float, default=0) – Maximum allowed gap between trades to group.

Returns:

Grouped meta-orders.

Return type:

list of list of Trade (namedtuple(“Trade”, [“timestamp”, “trade_type”, “direction”, “size”, “price”, “order_id”]))
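One way such grouping could work is sketched below. The rule used here is an assumption: consecutive trades are merged when they share trade_type and direction and the gap to the previous trade is at most time_delta. The library's exact criteria are not stated in this reference, and group_meta_orders is a hypothetical name.

```python
from collections import namedtuple

Trade = namedtuple(
    "Trade", ["timestamp", "trade_type", "direction", "size", "price", "order_id"]
)


def group_meta_orders(trades, time_delta=0.0):
    """Group time-sorted trades into meta-orders (assumed grouping rule)."""
    groups = []
    for trade in trades:
        last = groups[-1][-1] if groups else None
        same_kind = (
            last is not None
            and last.trade_type == trade.trade_type
            and last.direction == trade.direction
        )
        if same_kind and trade.timestamp - last.timestamp <= time_delta:
            groups[-1].append(trade)  # extend the current meta-order
        else:
            groups.append([trade])  # start a new meta-order
    return groups


trades = [
    Trade(1.0, "vis_exec", "bid", 100, 1000, 1),
    Trade(1.0, "vis_exec", "bid", 50, 1001, 2),
    Trade(1.4, "vis_exec", "bid", 20, 1002, 3),
    Trade(1.4, "vis_exec", "ask", 10, 999, 4),
]
strict = group_meta_orders(trades, time_delta=0)
loose = group_meta_orders(trades, time_delta=0.5)
```

With time_delta=0 only trades sharing a timestamp are merged; a larger time_delta also absorbs the third trade into the first group.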

mid_price() → float | None

Compute the midprice.

Returns:

Midprice if both sides exist, else None.

Return type:

float or None
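The best-price, spread, and midprice quantities documented above can be illustrated with plain dictionaries mapping price to volume. This is a sketch only: it assumes the midprice is the average of the best bid and best ask, and the function names are hypothetical stand-ins rather than the library's methods.

```python
from typing import Dict, Optional


def best_bid(bids: Dict[int, int]) -> int:
    # Mirrors the documented fallback: 0 when there are no bids.
    return max(bids) if bids else 0


def best_ask(asks: Dict[int, int]) -> float:
    # Mirrors the documented fallback: infinity when there are no asks.
    return min(asks) if asks else float("inf")


def mid_price(bids: Dict[int, int], asks: Dict[int, int]) -> Optional[float]:
    # The midprice is undefined unless both sides have at least one level.
    if not bids or not asks:
        return None
    return (best_bid(bids) + best_ask(asks)) / 2


bids = {1_000_000: 300, 999_900: 500}
asks = {1_000_200: 200, 1_000_300: 400}
spread = best_ask(asks) - best_bid(bids)
mid = mid_price(bids, asks)
```

Note that with an empty ask side the spread computed this way is infinite, which is worth guarding against before plotting or exporting.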

opposite_side_book_depth(order: LimitOrder) → int

Get total depth of the opposite side of the book.

Parameters:

order (LimitOrder) – LimitOrder object containing event details. See LimitOrder in orders.py for full definition.

Returns:

Total volume on the opposite side.

Return type:

int

order_sweeps(time_delta=0, level_threshold=2) → List[List[namedtuple]]

Identify order sweeps (large meta-orders across levels).

Parameters:
  • time_delta (float, default=0) – Maximum allowed gap between trades to group.

  • level_threshold (int, default=2) – Minimum number of unique price levels to qualify as a sweep.

Returns:

List of order sweeps.

Return type:

list of list of Trade (namedtuple(“Trade”, [“timestamp”, “trade_type”, “direction”, “size”, “price”, “order_id”]))
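The level_threshold test can be sketched as a filter over already grouped meta-orders. The sketch assumes a sweep is any meta-order whose trades touch at least level_threshold distinct prices; find_sweeps is a hypothetical name and the library may apply further conditions.

```python
from collections import namedtuple

Trade = namedtuple(
    "Trade", ["timestamp", "trade_type", "direction", "size", "price", "order_id"]
)


def find_sweeps(meta_orders, level_threshold=2):
    """Keep meta-orders spanning enough unique price levels (assumed rule)."""
    return [
        group
        for group in meta_orders
        if len({trade.price for trade in group}) >= level_threshold
    ]


meta_orders = [
    # Two trades at two different prices: qualifies at threshold 2.
    [Trade(1.0, "vis_exec", "bid", 100, 1000, 1), Trade(1.0, "vis_exec", "bid", 50, 1001, 2)],
    # Two trades at the same price: a single level, so not a sweep.
    [Trade(2.0, "vis_exec", "ask", 10, 999, 3), Trade(2.0, "vis_exec", "ask", 15, 999, 4)],
]
sweeps = find_sweeps(meta_orders, level_threshold=2)
```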

orderbook_price_range() → int

Get the price range spanned by the order book.

Returns:

Difference between worst ask and worst bid.

Return type:

int

process_order(order: Order) → None

Process a new order message and update the book accordingly.

Parameters:

order (Order) – Order object containing event details. See Order in orders.py for full definition.

Raises:

ValueError – If the direction, event type, or timestamp is invalid.

reset_cum_OFI()

Reset the cumulative Order Flow Imbalance (OFI) counters.

same_side_book_depth(order: LimitOrder) → int

Get total depth of the same side of the book.

Parameters:

order (LimitOrder) – LimitOrder object containing event details. See LimitOrder in orders.py for full definition.

Returns:

Total volume on the same side.

Return type:

int

symmetric_opposite_book_volume(order: LimitOrder) → int

Compute volume on the opposite side symmetric to the order price.

Parameters:

order (LimitOrder) – LimitOrder object containing event details. See LimitOrder in orders.py for full definition.

Returns:

Symmetric opposite-side volume.

Return type:

int

time_elapsed_since_first_available_order_with_same_price(order: LimitOrder) → float

Compute time elapsed since the first order at the same price.

Parameters:

order (LimitOrder) – LimitOrder object containing event details. See LimitOrder in orders.py for full definition.

Returns:

Time in seconds.

Return type:

float

time_elapsed_since_mid_price_change(order: LimitOrder) → float

Compute time elapsed since the last midprice change.

Parameters:

order (LimitOrder) – LimitOrder object containing event details. See LimitOrder in orders.py for full definition.

Returns:

Time in seconds.

Return type:

float

time_elapsed_since_most_recent_order_with_same_price(order: LimitOrder) → float

Compute time elapsed since the most recent order at the same price.

Parameters:

order (LimitOrder) – LimitOrder object containing event details. See LimitOrder in orders.py for full definition.

Returns:

Time in seconds.

Return type:

float

total_ask_volume() → int

Get total volume on the ask side.

Returns:

Sum of sizes across all ask levels.

Return type:

int

total_bid_volume() → int

Get total volume on the bid side.

Returns:

Sum of sizes across all bid levels.

Return type:

int

volume_of_higher_priority_orders(order: LimitOrder) → int

Get the total size of orders ahead of a given order in priority.

Parameters:

order (LimitOrder) – LimitOrder object containing event details. See LimitOrder in orders.py for full definition.

Returns:

Volume of higher-priority orders on the same side.

Return type:

int
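As an illustration of what "ahead in priority" can mean, the sketch below assumes standard price-time priority: a better price wins, and at equal prices the earlier timestamp wins. Both the rule and the names Resting and volume_ahead are assumptions made for this example; the library's own definition is not spelled out here.

```python
from collections import namedtuple

# Hypothetical stand-in with the same fields as the documented LimitOrder.
Resting = namedtuple("Resting", ["timestamp", "order_id", "size", "price", "direction"])


def volume_ahead(same_side_orders, order):
    """Sum the sizes of orders with higher price-time priority (assumed rule)."""

    def is_ahead(other):
        if other.order_id == order.order_id:
            return False
        if other.price != order.price:
            # A higher price is better for bids, a lower price is better for asks.
            if order.direction == "bid":
                return other.price > order.price
            return other.price < order.price
        # Same price: earlier arrivals are served first.
        return other.timestamp < order.timestamp

    return sum(other.size for other in same_side_orders if is_ahead(other))


bid_side = [
    Resting(1.0, 1, 100, 1000, "bid"),
    Resting(2.0, 2, 50, 1000, "bid"),
    Resting(1.5, 3, 70, 1001, "bid"),
    Resting(0.5, 4, 30, 999, "bid"),
]
ahead_of_order_2 = volume_ahead(bid_side, bid_side[1])
```

Here order 2 waits behind order 1 (same price, earlier) and order 3 (better price), but not behind order 4, which sits at a worse price.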

worst_ask_price() → int

Get the worst (highest) ask price in the book.

Returns:

Worst ask price.

Return type:

int

worst_bid_price() → int

Get the worst (lowest) bid price in the book.

Returns:

Worst bid price.

Return type:

int

lobster_sim Module

class lobster_reconstructor.lobster_sim.LobsterSim(orderbook: Orderbook, msg_book_file_path: str, lob_book_file_path: str = None)

Bases: object

LOBSTER simulation and visualization interface.

Provides functionality for replaying limit order book events, computing order flow imbalance (OFI), and generating visualizations using Plotly and Dash.

Parameters:
  • orderbook (Orderbook) – Orderbook object to operate on. See Orderbook in orderbook.py for full definition.

  • msg_book_file_path (str) – LOBSTER message.csv file path.

  • lob_book_file_path (str, default=None) – LOBSTER orderbook.csv file path. End users can leave this at the default; it is used only in debugging and testing to check that the reconstructed book matches the expected one.

orderbook

Orderbook object to operate on. See Orderbook in orderbook.py for full definition.

Type:

Orderbook

dataM

Contains message data pulled from LOBSTER message.csv with columns:

  • Time: float

  • Type: Literal[‘submit’, ‘cancel’, ‘delete’, ‘vis_exec’, ‘hid_exec’, ‘cross’, ‘halt’]

  • OrderID: int

  • Size: int

  • Price: int

  • Direction: Literal[‘bid’, ‘ask’]

Type:

pd.DataFrame
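The sketch below shows how one raw row of a LOBSTER message file could be turned into the labelled columns listed above. The numeric event codes (1 to 7) and direction codes (1 for buy, -1 for sell) follow the LOBSTER message file format; the mapping onto this library's string labels and the helper parse_message_row are assumptions for illustration, not the library's loader.

```python
# Assumed mapping from LOBSTER numeric codes to the labels used in dataM.
EVENT_TYPES = {
    1: "submit",
    2: "cancel",
    3: "delete",
    4: "vis_exec",
    5: "hid_exec",
    6: "cross",
    7: "halt",
}
DIRECTIONS = {1: "bid", -1: "ask"}


def parse_message_row(row: str) -> dict:
    """Parse one comma-separated message row (hypothetical helper)."""
    time, event_code, order_id, size, price, direction = row.strip().split(",")[:6]
    return {
        "Time": float(time),
        "Type": EVENT_TYPES[int(event_code)],
        "OrderID": int(order_id),
        "Size": int(size),
        "Price": int(price),
        "Direction": DIRECTIONS[int(direction)],
    }


parsed = parse_message_row("34200.5,1,16113575,18,5853300,1")
```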

count_OFI_graph(start_time: float, end_time: float, frame_interval: float, reset_ofi_interval: float = inf) → None

Plots a time series graph of the cumulative Count Order Flow Imbalance (OFI).

This function simulates the order book over a specified time range, calculating the cumulative Count OFI at regular intervals and plotting the results. The Count OFI measures the imbalance between the number of buy and sell orders.

Parameters:
  • start_time (float) – Timestamp (seconds after midnight) to start the simulation.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation.

  • frame_interval (float) – Time interval (in seconds) between each point plotted on the graph.

  • reset_ofi_interval (float, optional) – The time interval (in seconds) at which the cumulative OFI value is reset to zero. Defaults to np.inf, meaning the OFI is never reset within the plotting range.

create_animated_L3_app(start_time: float, end_time: float, interval: float) → Dash

Create an interactive Dash application showing an animated L3 order book.

The application displays horizontal bar charts of order sizes at each price level, updating over time to animate the evolution of the L3 book. Users can play/pause the animation or manually slide through frames.

Parameters:
  • start_time (float) – Timestamp (seconds after midnight) to start the simulation.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation.

  • interval (float) – Time interval (in seconds) between consecutive frames.

Returns:

A Dash application instance that can be run or embedded in a web server.

Return type:

dash.Dash

Notes

  • The method simulates the order book over the specified interval and stores snapshots in memory.

  • Each frame shows a horizontal bar chart of L3 order sizes by price and direction.

  • Users can interact via a play/pause button and a slider for manual navigation.

depth_percentile_graph(start_time: float, end_time: float, interval: float) → None

Creates a heatmap graph of order book depth in basis points (BPS) from the mid-price.

The heatmap visualizes the depth of the order book relative to the mid-price over time. The x-axis is time, the y-axis is the price level in BPS from the mid-price, and the color intensity at each point represents the size (volume) at that price level. A white horizontal line at 0 BPS indicates the mid-price.

Parameters:
  • start_time (float) – Timestamp (seconds after midnight) to start the simulation.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation.

  • interval (float) – Time interval (in seconds) between each data point (snapshot) on the heatmap.
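The y-axis of this heatmap expresses each price as a distance from the mid-price in basis points. A minimal sketch of that conversion, assuming the usual definition of 1 BPS as 0.01% of the mid-price (bps_from_mid is a hypothetical helper name):

```python
def bps_from_mid(price: float, mid: float) -> float:
    """Signed distance of a price from the mid-price, in basis points."""
    # Positive values lie above the mid (ask side), negative values below (bid side).
    return (price - mid) / mid * 10_000


mid = 1_000_000
ask_offset = bps_from_mid(1_001_000, mid)
bid_offset = bps_from_mid(999_000, mid)
```

Because the ratio is scale-free, the result is the same whether prices are in integer or display units, as long as both arguments use the same units.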

display_L2_snapshots(start_time: float, end_time: float, interval: float) → None

Display multiple L2 order book snapshots as subplots over a specified time range. Simulates the order book from start_time to end_time and generates a Plotly figure with subplots showing the L2 state at each interval. Each subplot title includes the current time, midprice, and bid-ask spread.

Parameters:
  • start_time (float) – Timestamp (seconds after midnight) to start the simulation and plotting.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation and plotting.

  • interval (float) – Time interval (in seconds) between consecutive snapshots.

display_L3_snapshots(start_time: float, end_time: float, interval: float) → None

Display multiple L3 order book snapshots as subplots over a specified time range. Simulates the order book from start_time to end_time and generates a Plotly figure with subplots showing the L3 state at each interval. Each subplot title includes the current time, midprice, and bid-ask spread.

Parameters:
  • start_time (float) – Timestamp (seconds after midnight) to start the simulation and plotting.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation and plotting.

  • interval (float) – Time interval (in seconds) between consecutive snapshots.

graph_trade_arrival_time(start_time: float, end_time: float, bin_size: float = None, filter_trade_type: Literal['aggro_lim', 'vis_exec', 'hid_exec'] = None) → None

Graphs the arrival count of bid and ask trades over time.

The function simulates trades within a specified time range, aggregates them into time bins, and plots a bar chart showing the number of buy (bid) and sell (ask) trades in each bin.

Parameters:
  • start_time (float) – Timestamp (seconds after midnight) to start the simulation.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation.

  • bin_size (float, optional) – The size of each time bin in seconds. If None, the bin size is set to 1/100th of the total time range.

  • filter_trade_type (Literal["aggro_lim", "vis_exec", "hid_exec"], optional) – A filter to display only a specific type of trade. Defaults to None, meaning all trade types are included.
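The binning step, including the documented default of 1/100th of the time range, can be sketched as below. The function bin_trade_counts is a hypothetical illustration that counts timestamps per bin; the half-open range [start_time, end_time) is an assumption.

```python
from collections import Counter


def bin_trade_counts(timestamps, start_time, end_time, bin_size=None):
    """Count trades per time bin (hypothetical helper)."""
    if bin_size is None:
        # Documented default: 1/100th of the total time range.
        bin_size = (end_time - start_time) / 100
    counts = Counter()
    for ts in timestamps:
        if start_time <= ts < end_time:
            counts[int((ts - start_time) // bin_size)] += 1
    return dict(counts)


timestamps = [34200.0, 34200.5, 34201.2, 34209.9]
per_second = bin_trade_counts(timestamps, 34200.0, 34210.0, bin_size=1.0)
```

In the library the counts are additionally split by direction; running the same helper once on bid trades and once on ask trades would give the two series.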

graph_trade_size_distribution(start_time: float, end_time: float, bin_size: int = 20, filter_trade_type: Literal['aggro_lim', 'vis_exec', 'hid_exec'] = None) → None

Graphs the size distribution of bid and ask trades.

This function simulates trades within a specified time range, filters out outliers using Z-score, and then creates a bar chart showing the distribution of trade sizes for both bids and asks.

Parameters:
  • start_time (float) – Timestamp (seconds after midnight) to start the simulation.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation.

  • bin_size (int, optional) – The size of each trade size bin. Defaults to 20.

  • filter_trade_type (Literal["aggro_lim", "vis_exec", "hid_exec"], optional) – A filter to display only a specific type of trade. Defaults to None, meaning all trade types are included.
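Z-score outlier filtering followed by fixed-width size bins can be sketched as follows. The threshold of 3 standard deviations and the use of the population standard deviation are assumptions, since the reference does not state the library's cut-off; both function names are hypothetical.

```python
from statistics import mean, pstdev


def filter_outliers(sizes, z_threshold=3.0):
    """Drop sizes whose absolute Z-score exceeds the threshold (assumed cut-off)."""
    if not sizes:
        return []
    mu = mean(sizes)
    sigma = pstdev(sizes)
    if sigma == 0:
        return list(sizes)  # all values identical, nothing to filter
    return [s for s in sizes if abs(s - mu) / sigma <= z_threshold]


def size_histogram(sizes, bin_size=20):
    """Count sizes per bin, keyed by the lower edge of each bin."""
    counts = {}
    for s in sizes:
        lower_edge = (s // bin_size) * bin_size
        counts[lower_edge] = counts.get(lower_edge, 0) + 1
    return counts


raw_sizes = [100] * 20 + [10_000]
kept = filter_outliers(raw_sizes)
histogram = size_histogram(kept, bin_size=20)
```

Filtering first keeps a single very large trade from stretching the x-axis and flattening the rest of the distribution.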

midprice_graph(start_time: float, end_time: float, interval: float) → None

Plots a time series graph of the mid-price of the order book.

This function simulates the order book over a specified time range, capturing the mid-price at regular intervals and plotting the results as a line graph.

Parameters:
  • start_time (float) – Timestamp (seconds after midnight) to start the simulation.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation.

  • interval (float) – Time interval (in seconds) between each data point plotted on the graph.

plot_price_levels_heatmap(start_time: float, end_time: float, interval: float, show_midprice: bool = True) → None

Creates a heatmap graph of order book price levels over time.

The heatmap visualizes the depth of the order book at different price levels over a specified time range. The x-axis represents time, the y-axis represents price, and the color intensity at each point indicates the total size (volume) of orders at that price level at that specific time.

Parameters:
  • start_time (float) – The timestamp in seconds after midnight to begin the simulation and plotting.

  • end_time (float) – The timestamp in seconds after midnight to end the simulation and plotting.

  • interval (float) – The time interval in seconds between each data point (snapshot) on the heatmap.

  • show_midprice (bool, optional) – If True, a white line representing the mid-price of the order book is overlaid on the heatmap. Defaults to True.

Notes

  • The self.simulate_until() and self.simulate_from_current_until() methods are used to advance the simulation and collect order book snapshots.

  • The price values are scaled by self.orderbook.price_scaling for accurate visualization.

  • This function uses the plotly.graph_objects library to generate an interactive heatmap.

print_features_to_csv(filename: str, start_time: float, end_time: float, interval: float, features: dict, batch_date: str, symbol: str, directory: str = '.', timestamp_round: int = 9) → None

Exports order book features to a CSV file with a default ‘timestamp’ column.

This function simulates the order book over a specified time range at fixed intervals, computes user-specified features, and writes the results to a CSV. If the file already exists and both its schema and ticker match, non-overlapping rows are appended and the data is re-sorted so earlier times appear first. Otherwise, the file is overwritten with the new data.

Parameters:
  • filename (str) – Base name for the CSV file (‘.csv’ is added automatically if missing).

  • start_time (float) – Timestamp (seconds after midnight) to start the simulation.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation.

  • interval (float) – Time step in seconds between samples (must be > 0).

  • features (dict) – Dictionary where keys are feature names and values are dictionaries specifying the order book method to call and its arguments. Example: {“mid_price”: {“method”: “mid_price”, “args”: []}}.

  • batch_date (str) – The trading date to associate with the exported rows (e.g., “2025-08-20”).

  • symbol (str) – The ticker symbol; must match the file’s ticker to append, otherwise the file is overwritten.

  • directory (str, optional) – Output directory for the CSV. Defaults to the current directory “.”.

  • timestamp_round (int, optional) – Decimal places to round timestamps for overlap checks. Defaults to 9.

Returns:

Writes the features to a CSV file. Prints status messages indicating whether rows were written, appended, dropped due to overlap, or skipped.

Return type:

None
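The shape of the features dictionary can be illustrated with a small dispatcher. This sketch assumes each entry is resolved by looking up the named method on the order book and calling it with the listed arguments; StubBook and compute_features are hypothetical and only stand in for the real Orderbook and the library's internal logic.

```python
class StubBook:
    """Toy object exposing two methods named like the documented Orderbook ones."""

    def mid_price(self):
        return 1_000_100.0

    def available_vol_at_price(self, price):
        return {1_000_000: 300}.get(price, 0)


def compute_features(book, features):
    """Evaluate every feature spec against the book (assumed dispatch scheme)."""
    row = {}
    for name, spec in features.items():
        method = getattr(book, spec["method"])  # look the method up by name
        row[name] = method(*spec.get("args", []))
    return row


features = {
    "mid_price": {"method": "mid_price", "args": []},
    "vol_at_100": {"method": "available_vol_at_price", "args": [1_000_000]},
}
row = compute_features(StubBook(), features)
```

One such row per sampled timestamp, plus the timestamp itself, would make up the exported table.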

sim_count_OFI(start_time: float, end_time: float) → int

Simulate the order book between two timestamps and compute the cumulative count-based Order Flow Imbalance (OFI).

Parameters:
  • start_time (float) – Timestamp (seconds after midnight) to start the simulation.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation.

Returns:

The cumulative count-based OFI computed over the simulation interval.

Return type:

int

Notes

The method resets the cumulative OFI at the start of the simulation, then processes all messages between start_time and end_time.

sim_size_OFI(start_time: float, end_time: float) → int

Simulate the order book between two timestamps and compute the cumulative size-based Order Flow Imbalance (OFI).

Parameters:
  • start_time (float) – Timestamp (seconds after midnight) to start the simulation.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation.

Returns:

The cumulative size-based OFI computed over the simulation interval.

Return type:

int

Notes

The method resets the cumulative OFI at the start of the simulation, then processes all messages between start_time and end_time.

simulate_from_current_until(time: float) → None

Continue reconstructing the order book from the current simulation state up to a specified timestamp. Does NOT reset orderbook state, unlike simulate_until.

Parameters:

time (float) – Time in seconds after midnight to simulate until.

Raises:

ValueError – If time is earlier than the current order book timestamp.

simulate_until(time: float) → None

Resets the order book state, then reconstructs it from the beginning of the message file up to the specified timestamp.

Parameters:

time (float) – Time in seconds after midnight to simulate until.
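The difference between simulate_until (reset, then replay) and simulate_from_current_until (continue from the current state) can be shown with a toy replayer. ToyReplayer is a hypothetical stand-in, not the library's LobsterSim: its "book" is a single running total, and treating the target time as inclusive is an assumption.

```python
class ToyReplayer:
    """Toy message replayer illustrating reset versus continue semantics."""

    def __init__(self, messages):
        # messages: list of (timestamp, size_change) tuples sorted by timestamp.
        self.messages = messages
        self.clock = 0.0
        self.position = 0    # index of the next unprocessed message
        self.total_size = 0  # toy "book state"

    def simulate_until(self, time):
        # Reset all state, then replay from the first message.
        self.clock, self.position, self.total_size = 0.0, 0, 0
        self.simulate_from_current_until(time)

    def simulate_from_current_until(self, time):
        if time < self.clock:
            raise ValueError("time is earlier than the current timestamp")
        while (
            self.position < len(self.messages)
            and self.messages[self.position][0] <= time
        ):
            self.total_size += self.messages[self.position][1]
            self.position += 1
        self.clock = time


sim = ToyReplayer([(1.0, 100), (2.0, 50), (3.0, -30)])
sim.simulate_until(2.0)
size_at_2 = sim.total_size
sim.simulate_from_current_until(3.0)
size_at_3 = sim.total_size
```

Continuing from the current state avoids replaying the whole message file for every sample, which is why sampling loops typically call simulate_until once and then step forward.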

size_OFI_graph(start_time: float, end_time: float, frame_interval: float, reset_ofi_interval: float = inf) → None

Plots a time series graph of the cumulative Size Order Flow Imbalance (OFI).

This function simulates the order book over a specified time range, calculating the cumulative Size OFI at regular intervals and plotting the results. The Size OFI measures the imbalance between the total size of buy and sell orders.

Parameters:
  • start_time (float) – Timestamp (seconds after midnight) to start the simulation.

  • end_time (float) – Timestamp (seconds after midnight) to end the simulation.

  • frame_interval (float) – Time interval (in seconds) between each point plotted on the graph.

  • reset_ofi_interval (float, optional) – The time interval (in seconds) at which the cumulative OFI value is reset to zero. Defaults to np.inf, meaning the OFI is never reset within the plotting range.

exception lobster_reconstructor.lobster_sim.MatchingError(side, csv_price, csv_size, recon_price, recon_size, message)

Bases: Exception

ofi Module

class lobster_reconstructor.ofi.OFI(Lb: OFIPair = <factory>, La: OFIPair = <factory>, Db: OFIPair = <factory>, Da: OFIPair = <factory>, Mb: OFIPair = <factory>, Ma: OFIPair = <factory>)

Bases: object

Represents the full set of components for Order Flow Imbalance (OFI).

This class encapsulates various OFI components, each represented by an OFIPair, tracking both size and count. The components are typically used to measure market pressure from different types of order book events.

Lb

Represents OFI from Limit Buy orders (new buy limits).

Type:

OFIPair

La

Represents OFI from Limit Ask orders (new sell limits).

Type:

OFIPair

Db

Represents OFI from Delete Buy orders (cancellations of buy limits).

Type:

OFIPair

Da

Represents OFI from Delete Ask orders (cancellations of sell limits).

Type:

OFIPair

Mb

Represents OFI from Market Buy orders (aggressor buys).

Type:

OFIPair

Ma

Represents OFI from Market Ask orders (aggressor sells).

Type:

OFIPair
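The sketch below mirrors the documented fields with dataclasses and combines the six components into a net value. The sign convention is an assumption based on the common definition in the order flow imbalance literature (buy-side pressure positive: Lb - Db + Mb - La + Da - Ma); this reference does not state the formula used by calc_size_OFI or calc_count_OFI, and net_ofi is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class OFIPair:
    size: int = 0
    count: int = 0

    def reset(self):
        self.size = 0
        self.count = 0


@dataclass
class OFI:
    Lb: OFIPair = field(default_factory=OFIPair)
    La: OFIPair = field(default_factory=OFIPair)
    Db: OFIPair = field(default_factory=OFIPair)
    Da: OFIPair = field(default_factory=OFIPair)
    Mb: OFIPair = field(default_factory=OFIPair)
    Ma: OFIPair = field(default_factory=OFIPair)

    def reset(self):
        for pair in (self.Lb, self.La, self.Db, self.Da, self.Mb, self.Ma):
            pair.reset()


def net_ofi(ofi: OFI, attr: str) -> int:
    """Net OFI for attr in {"size", "count"} under the assumed sign convention."""
    value = lambda pair: getattr(pair, attr)
    buy_pressure = value(ofi.Lb) - value(ofi.Db) + value(ofi.Mb)
    sell_pressure = value(ofi.La) - value(ofi.Da) + value(ofi.Ma)
    return buy_pressure - sell_pressure


ofi = OFI()
ofi.Lb.size, ofi.Lb.count = 300, 2  # two new buy limits totalling 300 shares
ofi.Da.size, ofi.Da.count = 100, 1  # one cancelled sell limit of 100 shares
ofi.Ma.size, ofi.Ma.count = 50, 1   # one aggressor sell of 50 shares
net_size = net_ofi(ofi, "size")
net_count = net_ofi(ofi, "count")
```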

Da: OFIPair
Db: OFIPair
La: OFIPair
Lb: OFIPair
Ma: OFIPair
Mb: OFIPair

reset()

class lobster_reconstructor.ofi.OFIPair(size: int = 0, count: int = 0)

Bases: object

Represents a pair of values for Order Flow Imbalance (OFI), specifically for size and count.

size

The cumulative size (volume) component of the OFI.

Type:

int, default=0

count

The cumulative count (number of orders) component of the OFI.

Type:

int, default=0

count: int = 0
reset()
size: int = 0

orders Module

class lobster_reconstructor.orders.LimitOrder(timestamp: float, order_id: int, size: int, price: int, direction: Literal['bid', 'ask'])

Bases: object

Represents a limit order in the order book.

Unlike Order, which is a raw message/event, a LimitOrder reflects the current resting state of an order in the book.

Parameters:
  • timestamp (float) – Time when the order was added in seconds after midnight.

  • order_id (int) – Unique identifier for the order.

  • size (int) – Remaining visible quantity of the order.

  • price (int) – Price level of the order (scaled to be an integer, e.g. x10000 for LOBSTER data).

  • direction ({'bid', 'ask'}) –

    Side of the order book:

    • ‘bid’ : Buy order.

    • ‘ask’ : Sell order.

direction: Literal['bid', 'ask']
order_id: int
price: int
size: int
timestamp: float

class lobster_reconstructor.orders.Order(timestamp: float, event_type: Literal['submit', 'cancel', 'delete', 'vis_exec', 'hid_exec', 'cross', 'halt'], order_id: int, size: int, price: int, direction: Literal['bid', 'ask'])

Bases: object

Represents a raw order event message from the limit order book data.

Parameters:
  • timestamp (float) – Event timestamp, in seconds after midnight.

  • event_type ({'submit', 'cancel', 'delete', 'vis_exec', 'hid_exec', 'cross', 'halt'}) –

    Type of order book event:

    • ‘submit’ : A new order submission.

    • ‘cancel’ : A partial cancellation of an existing order.

    • ‘delete’ : A complete removal of an existing order.

    • ‘vis_exec’ : Execution against visible liquidity.

    • ‘hid_exec’ : Execution against hidden liquidity.

    • ‘cross’ : Crossing event (buy/sell imbalance).

    • ‘halt’ : Trading halt event.

  • order_id (int) – Unique identifier for the order.

  • size (int) – Number of shares.

  • price (int) – Price level of the order (scaled to be an integer, e.g. x10000 for LOBSTER data).

  • direction ({'bid', 'ask'}) –

    Side of the order book the order belongs to:

    • ‘bid’ : Buy side.

    • ‘ask’ : Sell side.

direction: Literal['bid', 'ask']
event_type: Literal['submit', 'cancel', 'delete', 'vis_exec', 'hid_exec', 'cross', 'halt']
order_id: int
price: int
size: int
timestamp: float

utils Module

lobster_reconstructor.utils.format_timestamp(seconds_from_midnight: float, display_micro=False) → str

Formats a timestamp in seconds from midnight into a human-readable string.

Parameters:
  • seconds_from_midnight (float) – The timestamp in seconds, measured from midnight (00:00:00).

  • display_micro (bool, optional) – If True, the output string will include microseconds. Defaults to False.

Returns:

A formatted string representing the time in HH:MM:SS or HH:MM:SS.ffffff format.

Return type:

str
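A possible implementation of this conversion is sketched below to show the two output formats. It is not the library's code: the name format_seconds is hypothetical, and rounding to the nearest microsecond before splitting into fields is an assumption.

```python
def format_seconds(seconds_from_midnight: float, display_micro: bool = False) -> str:
    """Format seconds after midnight as HH:MM:SS or HH:MM:SS.ffffff (sketch)."""
    # Work in whole microseconds to avoid float artifacts in the fractional part.
    total_micro = round(seconds_from_midnight * 1_000_000)
    total_seconds, micro = divmod(total_micro, 1_000_000)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    text = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    if display_micro:
        text += f".{micro:06d}"
    return text


market_open = format_seconds(34200.0)
with_micro = format_seconds(34200.5, display_micro=True)
print(market_open)  # 09:30:00
print(with_micro)   # 09:30:00.500000
```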