hats.catalog.association_catalog
Submodules
Classes
AssociationCatalog | A HATS Catalog for enabling fast joins between two HATS catalogs
PartitionJoinInfo | Association catalog metadata describing which partitions contain matches in the join
Package Contents
- class AssociationCatalog(catalog_info: hats.catalog.dataset.table_properties.TableProperties, pixels: hats.catalog.partition_info.PartitionInfo | hats.pixel_tree.pixel_tree.PixelTree | list[hats.pixel_math.HealpixPixel], join_pixels: list | pandas.DataFrame | hats.catalog.association_catalog.partition_join_info.PartitionJoinInfo, catalog_path=None, moc: mocpy.MOC | None = None, schema: pyarrow.Schema | None = None)
Bases: hats.catalog.healpix_dataset.healpix_dataset.HealpixDataset
A HATS Catalog for enabling fast joins between two HATS catalogs
Catalogs of this type are partitioned based on the partitioning of the left catalog. The partition_join_info metadata file specifies all pairs of pixels in the Association Catalog, corresponding to each pair of partitions in each catalog that contain rows to join.
- join_info
- get_join_pixels() -> pandas.DataFrame
Get join pixels listing all pairs of pixels from left and right catalogs that contain matching association rows
- Returns:
pd.DataFrame with each row being a pair of pixels from the primary and join catalogs
- static _get_partition_join_info_from_pixels(join_pixels: list | pandas.DataFrame | hats.catalog.association_catalog.partition_join_info.PartitionJoinInfo) -> hats.catalog.association_catalog.partition_join_info.PartitionJoinInfo
- class PartitionJoinInfo(join_info_df: pandas.DataFrame, catalog_base_dir: str = None)
Association catalog metadata describing which partitions contain matches in the join
- PRIMARY_ORDER_COLUMN_NAME = 'Norder'
- PRIMARY_PIXEL_COLUMN_NAME = 'Npix'
- JOIN_ORDER_COLUMN_NAME = 'join_Norder'
- JOIN_PIXEL_COLUMN_NAME = 'join_Npix'
- COLUMN_NAMES
- data_frame
- catalog_base_dir = None
- primary_to_join_map() -> dict[hats.pixel_math.healpix_pixel.HealpixPixel, list[hats.pixel_math.healpix_pixel.HealpixPixel]]
Generate a map from a single primary pixel to one or more pixels in the join catalog.
The map is built with nested comprehensions: each entry pairs a (primary order/pixel) key with a list of (join order/pixel) values.
- Returns:
dictionary mapping (primary order/pixel) to [array of (join order/pixel)]
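The grouping described above can be sketched in plain Python. This is an illustrative sketch rather than the library's implementation: the `HealpixPixel` named tuple is a stand-in for `hats.pixel_math.healpix_pixel.HealpixPixel`, `build_primary_to_join_map` is a hypothetical helper, and the rows are made-up sample values in the column order `Norder`, `Npix`, `join_Norder`, `join_Npix`.

```python
from collections import defaultdict
from typing import NamedTuple


class HealpixPixel(NamedTuple):
    """Stand-in for hats.pixel_math.healpix_pixel.HealpixPixel (assumption)."""

    order: int
    pixel: int


def build_primary_to_join_map(rows):
    """Group (Norder, Npix, join_Norder, join_Npix) rows by primary pixel.

    Hypothetical helper, for illustration only.
    """
    mapping = defaultdict(list)
    for norder, npix, join_norder, join_npix in rows:
        primary = HealpixPixel(norder, npix)
        mapping[primary].append(HealpixPixel(join_norder, join_npix))
    return dict(mapping)


# Made-up sample rows: one primary pixel joins to two pixels, another to one.
sample_rows = [
    (0, 11, 1, 44),
    (0, 11, 1, 45),
    (1, 20, 1, 20),
]
pixel_map = build_primary_to_join_map(sample_rows)
```

Each key appears once, and its value lists every join pixel that shares rows with that primary pixel, in the order the rows were read.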
- write_to_csv(catalog_path: str | pathlib.Path | upath.UPath | None = None)
Write all partition data to CSV files.
Two files will be written:
partition_info.csv - covers all primary catalog pixels, and should match the file structure
partition_join_info.csv - covers all pairwise relationships between primary and join catalogs.
- Parameters:
catalog_path – path to the directory where the partition_join_info.csv file will be written
- Raises:
ValueError – if no path is provided, and could not be inferred.
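The path handling described above might look roughly like the following. This is a minimal sketch using only the standard library, not the hats implementation: `write_join_info_csv` and its `default_path` argument are hypothetical, and falling back to a stored base directory is an assumption about how the path is inferred.

```python
import csv
import tempfile
from pathlib import Path

COLUMN_NAMES = ["Norder", "Npix", "join_Norder", "join_Npix"]


def write_join_info_csv(rows, catalog_path=None, default_path=None):
    """Hypothetical helper: write rows to <path>/partition_join_info.csv."""
    # Prefer the explicit path; otherwise fall back to a stored default (assumption).
    target = catalog_path if catalog_path is not None else default_path
    if target is None:
        raise ValueError("no path provided, and none could be inferred")
    out_file = Path(target) / "partition_join_info.csv"
    with open(out_file, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(COLUMN_NAMES)
        writer.writerows(rows)
    return out_file


# Usage with made-up sample rows, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    written = write_join_info_csv([(0, 11, 1, 44), (0, 11, 1, 45)], catalog_path=tmp_dir)
    written_lines = written.read_text().splitlines()
```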
- classmethod read_from_dir(catalog_base_dir: str | pathlib.Path | upath.UPath | None = None) -> PartitionJoinInfo
Read partition join info from a partition_join_info file within a hats directory.
- Parameters:
catalog_base_dir – path to the root directory of the catalog
- Returns:
A PartitionJoinInfo object with the data from the file
- Raises:
FileNotFoundError – if the desired file is not found in the catalog_base_dir
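A rough sketch of this read path, using only the standard library. `read_join_info_rows` is a hypothetical helper, and placing `partition_join_info.csv` directly under the catalog root is an assumption; the real method returns a `PartitionJoinInfo` object rather than a list of dictionaries.

```python
import csv
import tempfile
from pathlib import Path


def read_join_info_rows(catalog_base_dir):
    """Hypothetical helper: load join rows from a catalog directory."""
    # Assumption: the file sits directly under the catalog root.
    csv_file = Path(catalog_base_dir) / "partition_join_info.csv"
    if not csv_file.exists():
        raise FileNotFoundError(f"No partition join info found at {csv_file}")
    with open(csv_file, newline="") as handle:
        # Every column holds an integer order or pixel number.
        return [
            {key: int(value) for key, value in row.items()}
            for row in csv.DictReader(handle)
        ]


with tempfile.TemporaryDirectory() as tmp_dir:
    # Made-up sample file with a single join pair.
    (Path(tmp_dir) / "partition_join_info.csv").write_text(
        "Norder,Npix,join_Norder,join_Npix\n0,11,1,44\n"
    )
    loaded_rows = read_join_info_rows(tmp_dir)

    # A directory without the file triggers the error path.
    try:
        read_join_info_rows(Path(tmp_dir) / "missing")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```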
- classmethod read_from_csv(partition_join_info_file: str | pathlib.Path | upath.UPath) -> PartitionJoinInfo
Read partition join info from a partition_join_info.csv file to create an object
- Parameters:
partition_join_info_file (UPath) – path to the partition_join_info.csv file
- Returns:
A PartitionJoinInfo object with the data from the file
- classmethod _read_from_csv(partition_join_info_file: str | pathlib.Path | upath.UPath) -> pandas.DataFrame
Read partition join info from a partition_join_info.csv file into a data frame
- Parameters:
partition_join_info_file (UPath) – path to the partition_join_info.csv file
- Returns:
A pandas.DataFrame with the data from the file