fontr.pipelines.bcf_preprocessing package
Complete data processing pipeline for the Adobe dataset.
Submodules
fontr.pipelines.bcf_preprocessing.nodes module
- read_bcf_metadata(bcf_file)[source]
Reads the metadata of a .bcf file using the file pointer given as the argument.
- The .bcf format looks as follows:
8 bytes - n, the number of .png files in the .bcf file.
8n bytes - the size of each .png file (8 bytes per file).
remaining bytes - the n .png files stored as raw bytes.
- Parameters:
bcf_file (fsspec.core.OpenFile) – File descriptor to the .bcf file.
- Returns:
File descriptor to the .bcf file, together with the sizes of the .png files read from it.
- Return type:
tuple[fsspec.core.OpenFile, np.ndarray]
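The header layout described above can be sketched as follows. This is a minimal illustration, not the node's actual implementation: it assumes the counts and sizes are little-endian unsigned 64-bit integers (the byte order is not stated here), it uses the standard-library `struct` module and returns a plain tuple rather than an `np.ndarray`, and the name `read_bcf_header` is hypothetical.

```python
import io
import struct


def read_bcf_header(stream):
    """Read the .bcf header and return the size of each stored file.

    Assumption: all integers are little-endian unsigned 64-bit values.
    """
    # First 8 bytes: n, the number of .png files.
    (n,) = struct.unpack("<Q", stream.read(8))
    # Next 8n bytes: one 8-byte size per file.
    sizes = struct.unpack(f"<{n}Q", stream.read(8 * n))
    return sizes


# Build a tiny in-memory .bcf-like stream holding two fake "files".
payloads = [b"abc", b"defgh"]
raw = struct.pack("<Q", len(payloads))
raw += struct.pack(f"<{len(payloads)}Q", *(len(p) for p in payloads))
raw += b"".join(payloads)

stream = io.BytesIO(raw)
file_sizes = read_bcf_header(stream)
# The stream is now positioned at the first byte of the first file.
header_end = stream.tell()
```

After the header is read, the stream position sits at the start of the raw image data, which is why the same file descriptor can be passed on for extraction.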
- read_labels(label_file)[source]
Reads the labels saved under label_file and converts them into a csv file.
- Parameters:
label_file (fsspec.core.OpenFile) – File descriptor to the .label file
- Returns:
Read labels as dataframe
- Return type:
pd.DataFrame
- upload_bcf_as_png(bcf_file, file_sizes, output_path)[source]
Extracts the .png files stored in a .bcf file and stores them under output_path.
- Parameters:
bcf_file (fsspec.core.OpenFile) – File descriptor to the .bcf file.
file_sizes (np.ndarray) – File sizes read in read_bcf_metadata node.
output_path (str) – Path where the .png files are stored
- Return type:
None
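A rough sketch of the extraction step, under stated assumptions: the stream is already positioned just past the header (as it would be after the sizes have been read), the output is a local directory written with `pathlib` rather than through fsspec, and the function name `write_bcf_images` and the `<index>.png` naming scheme are hypothetical, chosen only for illustration.

```python
import io
import tempfile
from pathlib import Path


def write_bcf_images(stream, file_sizes, output_dir):
    """Split the raw payload into files using the sizes from the header.

    Assumption: `stream` points at the first byte of the first image.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for index, size in enumerate(file_sizes):
        # Each image occupies exactly `size` consecutive bytes.
        data = stream.read(int(size))
        target = out / f"{index}.png"  # hypothetical naming scheme
        target.write_bytes(data)
        written.append(target)
    return written


# Two fake "images" of 3 and 5 bytes, concatenated as in a .bcf payload.
payload = io.BytesIO(b"abc" + b"defgh")
with tempfile.TemporaryDirectory() as tmp:
    paths = write_bcf_images(payload, [3, 5], tmp)
    names = [p.name for p in paths]
    contents = [p.read_bytes() for p in paths]
```

Because the files are stored back to back with no separators, the sizes are the only way to find the boundaries, so they must be consumed in the same order in which they appear in the header.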