Use CAN .DBC files to extract data from .MF4 log files with asammdf, visualize each signal with pandas and Matplotlib, run anomaly detection on each signal using mean, standard-deviation, and percentile analysis, and generate a PPT report (with graphs and a list of potentially abnormal data) using PPTX.
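As a rough illustration of the anomaly-detection step, the sketch below flags values that are either far from the mean in units of standard deviation or outside a percentile band. The function name `flag_anomalies`, the threshold `k`, and the percentile bounds are assumptions made for this example; they are not the program's actual settings.

```python
import statistics


def flag_anomalies(values, k=3.0, lower_pct=1, upper_pct=99):
    """Return the indices of values that look abnormal under either rule.

    Rule 1: farther than k standard deviations from the mean.
    Rule 2: outside the [lower_pct, upper_pct] percentile band.
    Needs at least two values; all thresholds are illustrative defaults.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # quantiles(n=100) returns the 99 cut points P1..P99
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]

    flagged = []
    for i, v in enumerate(values):
        too_far = std > 0 and abs(v - mean) > k * std
        out_of_band = v < low or v > high
        if too_far or out_of_band:
            flagged.append(i)
    return flagged


signal = [10.0] * 50 + [100.0]   # one obvious spike at the end
print(flag_anomalies(signal))
```

On long, noisy signals the percentile rule flags a small share of points by construction, so its output is better read as a list of candidates for review than as confirmed faults.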
- asammdf 5.20.6 or above
- cchardet 2.1.5 (asammdf installs correctly with this version; v2.1.6 may cause a download error)
- lz4 3.1.0
- PPTX
|
├── original data folder
│   ├── original_1.mf4
│   ├── original_2.mf4
│   └── ...
├── test data 1 folder
│   ├── test1_1.mf4
│   ├── test1_2.mf4
│   └── ...
├── test data 2 folder
│   ├── test2_1.mf4
│   ├── test2_2.mf4
│   └── ...
└── ...
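A folder layout like the one above could be scanned into a nested dictionary (folder name to list of files) along the following lines. This is only a sketch: `scan_data_dir` is a hypothetical name used for illustration, not one of the program's functions.

```python
from pathlib import Path


def scan_data_dir(path_data_dir):
    """Map each sub-folder name to the sorted list of .mf4 files inside it.

    Note: the "*.mf4" pattern is case-sensitive on most Linux file systems.
    """
    tree = {}
    for folder in sorted(Path(path_data_dir).iterdir()):
        if folder.is_dir():
            tree[folder.name] = sorted(folder.glob("*.mf4"))
    return tree
```

Calling `scan_data_dir(path_data_dir)` returns one key per data folder, so the files of a given test run can be looked up by folder name without walking the directory again.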
- path:
  - `path_data_dir`: the directory where the Data Folder (containing all data) is located
  - `path_dbc_dir`: the directory where all the DBC files are located (it is suggested that all `.dbc` files be put under one directory)
  - `path_signal_excel`: the Signal Checkpoint Excel file that decides which signals to select and plot
  - `path_to_create_folder`: the target directory in which to create a folder for the figures
  - `path_to_create_ppt`: the target directory in which to create the PPT file
- `dbc_channels`: a dictionary whose keys are CAN channel names and whose values are the corresponding lists of `.dbc` file name(s). Since the location of the `.dbc` files is specified in `path_dbc_dir`, the file name alone is enough; the absolute path is not needed.
  Example: "Ch3": ['GWM V71 CAN 01C.dbc'], "Ch4": ['FR-IFC-Private CAN.dbc'], "Ch5": ['GWM V71 CAN 01C.dbc'], "Ch6": ['FR-IFC-Private CAN.dbc']
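For illustration, the example mapping could be written as a Python dictionary and resolved against `path_dbc_dir` as sketched below. The helper `resolve_dbc_paths`, the placeholder directory, and the assumption that channel names always look like `Ch<number>` are not taken from the program itself.

```python
from pathlib import Path

# Placeholder directory; substitute the real location of the .dbc files.
path_dbc_dir = Path("path/to/dbc")

dbc_channels = {
    "Ch3": ["GWM V71 CAN 01C.dbc"],
    "Ch4": ["FR-IFC-Private CAN.dbc"],
    "Ch5": ["GWM V71 CAN 01C.dbc"],
    "Ch6": ["FR-IFC-Private CAN.dbc"],
}


def resolve_dbc_paths(dbc_channels, path_dbc_dir):
    """Return (dbc_path, channel_number) pairs, e.g. "Ch3" gives channel 3."""
    pairs = []
    for channel_name, file_names in dbc_channels.items():
        # Assumes channel names have the form "Ch<number>".
        channel_number = int(channel_name[len("Ch"):])
        for file_name in file_names:
            pairs.append((Path(path_dbc_dir) / file_name, channel_number))
    return pairs


for dbc_path, channel in resolve_dbc_paths(dbc_channels, path_dbc_dir):
    print(channel, dbc_path)
```

Keeping the channel number next to each file makes the DBC-to-channel assignment explicit. Recent asammdf releases document an `MDF.extract_bus_logging` method whose `database_files` argument takes (database, bus channel) pairs of this shape; whether that applies to the asammdf version used here is an assumption to verify.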
- Reading & converting MF4 files: calling `asammdf.MDF.extract_can_logging(dbc)` directly can lead to channel confusion if the DBC-to-channel assignment is not the same for every MF4 log file. An alternative is to manually extract the channel information from each `.dbc` file and run `extract_can_logging` on every existing channel (this relies on `asammdf.MDF.bus_logging_map`).
- Different CAN signals have different data-collection frequencies: when merging data into one dataframe, the data need to be aligned to the `camera id` signal so that they can be analyzed and plotted. However, the data-collection frequency of the `camera id` signal does not always match that of the other signals, which leaves some cells missing in the merged dataframe (e.g., `camera id` might be collected every 10 ms while another signal is collected every 15 ms). After discussing this with validation engineers, I decided to use the `ffill` (and `bfill`) methods to fill in the missing data and to keep only the unique `camera id`s after filling, so that no `camera id` is duplicated in the plot and the data stay consistent.
- Some trivial structural problems: the data folder structure and the nested dictionary structure of the `list_file_path` output and of `load_mf4_to_dic_for_all` were designed to reduce time and space complexity when reading and accessing the various data files.
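The frequency-alignment step described above can be sketched with pandas as follows. The column names, sample values, and the choice to keep the first row per camera id are assumptions for this example; the program's own implementation may differ.

```python
import pandas as pd

# Timestamps in ms: camera id every 10 ms, another signal every 15 ms.
camera = pd.Series([100, 101, 102, 103], index=[0, 10, 20, 30], name="camera_id")
speed = pd.Series([5.0, 5.5, 6.0], index=[0, 15, 30], name="speed")

# Outer join on timestamp: one row for every timestamp seen in either signal.
merged = pd.concat([camera, speed], axis=1).sort_index()

# Fill gaps forward, then backward to cover any gaps at the very start.
filled = merged.ffill().bfill()

# Keep a single row per camera id (here: the first one after filling).
aligned = filled.drop_duplicates(subset="camera_id", keep="first")
print(aligned)
```

After this step every remaining row has a value for every signal, and each camera id appears exactly once, which is what the plots need.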
- Reading a data file containing over 4000 rows of timestamp data and 1500 columns of signals only takes 3 seconds
- Generating dataframes and plotting for each signal only takes about 1 second
- The robustness of this program can be enhanced: place fewer restrictions on the data folder's structure, and find a better method of distinguishing `Original` and `Test` data files
- A GUI window could be added for the user to select data files

Below are some screenshots of the PPT report generated by this program. More Matplotlib figures can be found in the `Demo_fig` folder: