From 133dcb05780cf1f7ccfd5480ed127617ea83524a Mon Sep 17 00:00:00 2001
From: hugoval <76450221+yudgugger@users.noreply.github.com>
Date: Tue, 7 May 2024 19:14:25 +0200
Subject: [PATCH] Update README.md

---
 doc/README.md | 57 ++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 47 insertions(+), 10 deletions(-)

diff --git a/doc/README.md b/doc/README.md
index d97a3f31..65c90c16 100644
--- a/doc/README.md
+++ b/doc/README.md
@@ -99,12 +99,13 @@
  • VIII.2.c) Figures
  -• VIII.3) Curve clustering (K-Means)
  +• VIII.3) Loop clustering (K-Means)
  • VIII.4) Mean hysteresis
    @@ -1581,10 +1582,10 @@ The measurement parameters are extracted from the SSPFM measurement sheet, and e
    Graph showing the evolution of the phase offset determined by the script toolbox/phase_offset_analyzer.py as a function of the index of the raw SSPFM measurement files (figure generated with generate_graph_offset function of toolbox/phase_offset_analyzer.py script).

    -### VIII.3) Curve clustering (K-Means)
    +### VIII.3) Loop clustering (K-Means or GMM)

    -The script can be executed directly using the executable file: toolbox/curve_clustering.py or through the graphical user interface: gui/curve_clustering.py. It facilitates the classification of loops associated with each measurement point into clusters. This tool can enable phase separation or the separation of the influences of physically distinct phenomena, such as measurement artifacts.
    +The script can be executed directly using the executable file: toolbox/loop_clustering.py or through the graphical user interface: gui/loop_clustering.py. It facilitates the classification of loops associated with each measurement point into clusters. This tool can enable phase separation or the separation of the influences of physically distinct phenomena, such as measurement artifacts.

    #### VIII.3.a) Parameters
    @@ -1608,7 +1609,7 @@ The script can be executed directly using the executable file:
    • File management: In the initial phase, the algorithm ingests the best_nanoloops directory along with the properties directory.
    • Method: Method used to perform clustering: K-Means or Gaussian Mixture Model (GMM).
    -• Label measure: One or more measure considered to determine the curve (from amplitude, phase, piezoresponse, resonance frequency or quality).
    +• Label measure: One or more measures used to build the loop (from amplitude, phase, piezoresponse, resonance frequency or quality).
    • Clusters: For each measurement (on field, off field, and coupled), the user specifies the number of clusters.
    • Save and plot parameters: Pertaining to the management of display and the preservation of outcomes.

    @@ -1621,8 +1622,8 @@ The script can be executed directly using the executable file:
    The entirety of data stemming from the best nanoloops, both in the on field and off field modes, is extracted from the files residing within the best_nanoloops directory (with the function extract_data of the script).
    -For piezoresponse curve analysis, vertical offset measurements in the off field mode and the dimensions of the mappings are drawn from the files within the properties directory (with extract_properties function of the script utils/nanoloop_to_hyst/file.py).
    -For piezoresponse curve analysis, the coupled measurements are subsequently generated through the process of differential analysis of on field and off field measurements, with the flexibility to incorporate the vertical offset in the off field mode, a component influenced by the sample's surface contact potential (section VI.4.d) - Differential analysis of on and off field hysteresis in the documentation).
    +For piezoresponse loop analysis, vertical offset measurements in the off field mode and the dimensions of the mappings are drawn from the files within the properties directory (with extract_properties function of the script utils/nanoloop_to_hyst/file.py).
    +For piezoresponse loop analysis, the coupled measurements are subsequently generated through the process of differential analysis of on field and off field measurements, with the flexibility to incorporate the vertical offset in the off field mode, a component influenced by the sample's surface contact potential (section VI.4.d) - Differential analysis of on and off field hysteresis in the documentation).

    @@ -1633,7 +1634,7 @@ For a deeper understanding of the input file management, please refer to the rel
    #### VIII.3.c) Treatment

    -Initially, following data extraction, a curve is constructed. If multiple measures are specified in the label_meas parameter, they are normalized between 0 and 1 and concatenated together in the gen_curve_data function of the script. To analyze ferroelectric as well as electrostatic effects, the quality of clusterization can be enhanced by composing amplitude with phase rather than simply relying on piezoresponse [28]. To study nanomechanical properties under in situ material polarization, resonance frequency and quality factor curves can be selected (for elastic and dissipative properties, respectively). This can be particularly relevant, for example, in the study of relaxor ferroelectric materials [7]. For each of the modes (on field, off field, and eventually coupled), and for each of the nanoloops associated with each data point, a cluster is assigned using the machine learning K-Means or GMM methodology. To accomplish this, we import the KMeans and GMM functions from sklearn.cluster. A reference cluster is established, identified as the one encompassing the maximum number of data points. The index assigned to the other clusters is then computed as the distance between their centroid and that of the reference cluster, respectively. In other words, the clustering indexing provides the user with information about the separation (determined with quantitative data) of each cluster relative to the reference cluster. Subsequently, an average curve for each cluster is computed.
    +Initially, following data extraction, a loop is constructed. If multiple measures are specified in the label_meas parameter, they are normalized between 0 and 1 and concatenated together in the gen_loop_data function of the script. To analyze ferroelectric as well as electrostatic effects, the quality of clusterization can be enhanced by composing amplitude with phase rather than simply relying on piezoresponse [28]. 
To study nanomechanical properties under in situ material polarization, resonance frequency and quality factor loops can be selected (for elastic and dissipative properties, respectively). This can be particularly relevant, for example, in the study of relaxor ferroelectric materials [7]. For each mode (on field, off field and, where applicable, coupled), the nanoloop associated with each data point is assigned to a cluster using the K-Means or GMM machine learning method. Both methods come from scikit-learn: KMeans is provided by sklearn.cluster and the Gaussian mixture model (GaussianMixture) by sklearn.mixture. The reference cluster is the one containing the largest number of data points. Each other cluster is then indexed by the distance between its centroid and that of the reference cluster. In other words, the cluster index gives the user a quantitative measure of how far each cluster lies from the reference cluster. Finally, an average loop is computed for each cluster.
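
The treatment steps described above can be sketched as follows. This is a minimal illustration, not the code of the script: the function names `normalize_01`, `build_loop_vector`, `index_by_reference` and `mean_loop_per_cluster` are hypothetical, and normalizing each measure of each loop independently is an assumption. The labels and centroids are taken as inputs; they would typically come from a K-Means or GMM fit.

```python
import numpy as np


def normalize_01(values):
    """Rescale a 1D array linearly to the [0, 1] range."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)  # flat measure: nothing to rescale
    return (values - values.min()) / span


def build_loop_vector(measures):
    """Normalize each measure, then concatenate them into one vector."""
    return np.concatenate([normalize_01(m) for m in measures])


def index_by_reference(labels, centroids):
    """Return the reference cluster and each centroid's distance to it.

    The reference is the cluster holding the most points (distance 0).
    """
    counts = np.bincount(labels, minlength=len(centroids))
    reference = int(np.argmax(counts))
    distances = np.linalg.norm(centroids - centroids[reference], axis=1)
    return reference, distances


def mean_loop_per_cluster(loops, labels):
    """Average the loops (rows) assigned to each cluster label."""
    return {int(k): loops[labels == k].mean(axis=0) for k in np.unique(labels)}
```

Sorting the clusters by the returned distances gives an ordering in which the reference cluster comes first and the most dissimilar cluster comes last.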

    #### VIII.3.d) Figures
    @@ -1643,17 +1644,53 @@ Initially, following data extraction, a curve is constructed. If multiple measur
    - Result of curve_clustering (figure generated with main_curve_clustering function of toolbox/curve_clustering.py script)
    + Result of loop_clustering (figure generated with main_loop_clustering function of toolbox/loop_clustering.py script)

    For each mode (on field, off field and, where applicable, coupled), three figures are generated, each containing:
    • A complete representation of cluster points in 2D graph with their centroids, distinguished by colors assigned based on their cluster index.
    -• The complete array of curve from all datasets, distinguished by colors assigned based on their cluster index.
    -• The average curve for each cluster, distinguished by colors assigned according to their cluster index.
    +• The complete array of loops from all datasets, distinguished by colors assigned based on their cluster index.
    +• The average loop for each cluster, distinguished by colors assigned according to their cluster index.
    • A spatial cartography displaying the assigned clusters.

    +#### VIII.3.e) Curve clustering (K-Means or GMM) + +

    +A similar version of this code also exists for SSPFM raw data, typically used for clustering measurements of deflection or height channels. +

    + +

    +The script can be executed directly using the executable file: toolbox/curve_clustering.py or through the graphical user interface: gui/curve_clustering.py. It facilitates the classification of curves associated with each measurement point into clusters. This tool can enable phase separation or the separation of the influences of physically distinct phenomena, such as measurement artifacts. +

    +
    +```
    +    default_user_parameters = {
    +        'dir path in': '',
    +        'csv path in': '',
    +        'dir path out': '',
    +        'extension': 'spm',
    +        'mode': 'classic',
    +        'method': "kmeans",
    +        'label meas': ["deflection"],
    +        'nb clusters': 4,
    +        'verbose': True,
    +        'show plots': True,
    +        'save': False,
    +    }
    +```
    +

    +• File management: In the initial phase, the algorithm ingests the raw measurement directory.
    +• Extension: Extension of raw SSPFM measurement files (from spm, txt, csv or xlsx).
    +• Mode: Measurement mode of raw SSPFM measurement files (from classic (for Frequency Sweep in Resonance or Single Frequency) or dfrt).
    +• Method: Method used to perform clustering: K-Means or Gaussian Mixture Model (GMM).
    +• Label measure: One or more measures used to build the curve (from height, deflection, ...).
    +• Clusters: For each measurement, the user specifies the number of clusters.
    +• Save and plot parameters: Pertaining to the management of display and the preservation of outcomes.
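
To illustrate how the method and cluster-count parameters could drive the clustering, here is a hedged sketch using scikit-learn. It is not the code of toolbox/curve_clustering.py: the function `cluster_curves`, the `"gmm"` method string, the diagonal covariance choice and the synthetic curves are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture


def cluster_curves(curves, method="kmeans", nb_clusters=4, seed=0):
    """Cluster row-wise curves and return (labels, centroids)."""
    curves = np.asarray(curves, dtype=float)
    if method == "kmeans":
        model = KMeans(n_clusters=nb_clusters, n_init=10, random_state=seed)
        labels = model.fit_predict(curves)
        centroids = model.cluster_centers_
    elif method == "gmm":  # hypothetical value for the 'method' parameter
        # Diagonal covariance (an assumption) keeps the fit stable when
        # curves have many points relative to the number of curves.
        model = GaussianMixture(n_components=nb_clusters,
                                covariance_type="diag", random_state=seed)
        labels = model.fit_predict(curves)
        centroids = model.means_
    else:
        raise ValueError(f"unknown method: {method!r}")
    return labels, centroids


# Synthetic stand-in for deflection curves: two well-separated groups.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
flat = rng.normal(0.0, 0.01, size=(15, x.size))
ramp = 5.0 * x + rng.normal(0.0, 0.01, size=(10, x.size))
curves = np.vstack([flat, ramp])

labels, centroids = cluster_curves(curves, method="kmeans", nb_clusters=2)
```

Returning centroids for both methods (cluster centers for K-Means, component means for GMM) lets the same downstream code compute distances to a reference cluster.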
    +

    + ### VIII.4) Mean hysteresis