Generate duplicate nodes for files #22

Open
HickeyHsu opened this issue May 26, 2022 · 2 comments

Comments

@HickeyHsu

This is a very exciting tool.
I tried to use it to generate the FILE_RESULT_DEPENDENCY_GRAPH of a Java project and found a problem.
For each Java file, the tool generates two nodes: 1) a file node (with absolute_name) and 2) a class node (with only a display name).

For example, node<D:\idea_workspace\ACPG4J\src\main\java\analyser4J\graph\AbstractVertex.java> and node<analyser4J.graph.AbstractVertex>.

In my opinion, they should be treated as one single node.
I'm wondering if there is a solution.
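
To illustrate why the two nodes describe the same entity, here is a minimal sketch that derives the class-style name from the file path. It is not the tool's implementation: the helper `to_qualified_name` is a hypothetical name, and the sketch assumes that `src\main\java` is the source root and that each file defines one top-level class named after the file.

```python
from pathlib import PureWindowsPath


def to_qualified_name(file_path: str, source_root: str) -> str:
    """Turn a Java source path into a dotted class-style name.

    Hypothetical helper: assumes one top-level class per file, named after
    the file, and that source_root is the directory containing the packages.
    """
    # PureWindowsPath parses backslash-separated paths on any operating system.
    relative = PureWindowsPath(file_path).relative_to(PureWindowsPath(source_root))
    # Drop the ".java" suffix and join the remaining path parts with dots.
    return ".".join(relative.with_suffix("").parts)


file_node = r"D:\idea_workspace\ACPG4J\src\main\java\analyser4J\graph\AbstractVertex.java"
root = r"D:\idea_workspace\ACPG4J\src\main\java"
print(to_qualified_name(file_node, root))  # analyser4J.graph.AbstractVertex
```

With such a mapping, the file node and the class node from the example resolve to the same identifier.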

Again, this project is pretty amazing. Thanks!

@glato
Owner

glato commented May 29, 2022

@HickeyHsu Thanks for the nice feedback 👍. Does this also relate to issue #23 that you've described? If not, could you give me a detailed example (and maybe even post a small screenshot of the graph/issue)? Thanks for your help!

@HickeyHsu
Author

> @HickeyHsu Thanks for the nice feedback 👍. Does this also relate to issue #23 that you've described? If not, could you give me a detailed example (and maybe even post a small screenshot of the graph/issue)? Thanks for your help!

It's a more common problem than #23. Almost every self-defined class is duplicated.
For example, I define a class Graphviz in the file ACPG4J\src\main\java\analyser4J\util\Graphviz.java and import it in another file, ACPG4J\src\main\java\module\cpg\graphs\cpg\CodePropertyGraph.java.

The first node is generated from the file that defines the class, like this:

    <node id="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\util\Graphviz.java">
      <data key="d0">D:\idea_workspace\ACPG4J\src\main\java\analyser4J\util\Graphviz.java</data>
      <data key="d1">Graphviz.java</data>
      <data key="d2">19</data>
      <data key="d3">156</data>
      <data key="d231">0.33919588761078556</data>
      <data key="d232">0.30083373993452384</data>
      <data key="d35">0.2775904223555037</data>
      <data key="d233">0.2612145985576437</data>
      <data key="d234">0.2612145985576437</data>
      <data key="d235">0.24112116789936341</data>
      <data key="d73">0.2089258235662492</data>
      <data key="d9">1</data>
      <data key="d10">0</data>
      <data key="d11">6</data>
    </node>

The second node is generated from the import in another class, like this:

    <node id="analyser4J.util.Graphviz">
      <data key="d1">analyser4J.util.Graphviz</data>
      <data key="d233">0.4036928296883384</data>
      <data key="d35">0.2983585580188285</data>
      <data key="d234">0.3139833119798187</data>
      <data key="d283">0.26912855312555894</data>
      <data key="d284">0.24414695886471777</data>
      <data key="d285">0.22427379427129912</data>
      <data key="d286">0.22427379427129912</data>
      <data key="d9">9</data>
      <data key="d10">1</data>
      <data key="d11">6</data>
      <data key="d21">0</data>
      <data key="d22">0</data>
      <data key="d12">1</data>
      <data key="d13">6</data>
    </node>

with an edge:

    <edge source="D:\idea_workspace\ACPG4J\src\main\java\module\cpg\graphs\cpg\CodePropertyGraph.java" target="analyser4J.util.Graphviz" />

In fact, looking at the edge collection, we can see that every edge starts at a file and ends at a class entity:

    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.astgen.finder.NodeFinderConfig" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.astgen.finder.NodeLocator" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.astgen.helpers.FilePosConverter" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.astgen.helpers.FileSystemHelpers" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.builder.PDGBuilder" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.builder.PDGBuilderConfig" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.builder.SlicedPDGBuilder" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.graph.PDG" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.graph.Vertex" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.graph.VertexType" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.slice.Slicer" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.slice.config.LineNumSliceConfig" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.slice.config.SliceConfig" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.util.DotGraphExporter" />
    <edge source="D:\idea_workspace\ACPG4J\src\main\java\analyser4J\PDGSlicing.java" target="analyser4J.util.Utils" />
......
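
Because every edge runs from a file path to a class name, the two kinds of node never coincide. A possible post-processing step is sketched below, assuming the source root is known and each file defines a class named after the file. The names `qualified_name` and `unify_edges` are hypothetical and not part of the tool, and the second sample edge is made up for illustration.

```python
from pathlib import PureWindowsPath
from typing import Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def qualified_name(node_id: str, source_root: str) -> str:
    """Map a .java path under source_root to a dotted name; leave other ids unchanged."""
    path = PureWindowsPath(node_id)
    root = PureWindowsPath(source_root)
    if path.suffix != ".java" or root not in path.parents:
        return node_id
    return ".".join(path.relative_to(root).with_suffix("").parts)


def unify_edges(edges: Iterable[Edge], source_root: str) -> List[Edge]:
    """Rewrite both endpoints of every edge to dotted names, dropping duplicate edges."""
    seen: Set[Edge] = set()
    unified: List[Edge] = []
    for source, target in edges:
        edge = (qualified_name(source, source_root), qualified_name(target, source_root))
        if edge not in seen:
            seen.add(edge)
            unified.append(edge)
    return unified


root = r"D:\idea_workspace\ACPG4J\src\main\java"
edges = [
    # Edge taken from the listing above.
    (root + r"\analyser4J\PDGSlicing.java", "analyser4J.util.Utils"),
    # Made-up edge, only here to show a file node merging with a class node.
    (root + r"\analyser4J\util\Utils.java", "analyser4J.graph.PDG"),
]
merged = unify_edges(edges, root)
nodes_before = {node for edge in edges for node in edge}
nodes_after = {node for edge in merged for node in edge}
print(len(nodes_before), len(nodes_after))  # 4 3
```

In this sample the file node for Utils.java and the class node analyser4J.util.Utils collapse into one, so the node count drops from four to three.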

I also tried to fix it by using class paths (module name plus class name) instead of absolute paths as node names:

    # Needs: from pathlib import Path; from typing import Dict
    def calculate_dependency_graph_from_results_file_merged(self, results: Dict[str, FileResult]) -> None:
        """Construct a dependency graph from file results, merging file and class nodes.

        Each file is keyed by "<module_name>.<file stem>" instead of its absolute
        path, so that it can match the name used by imports in other files.

        Args:
            results (Dict[str, FileResult]): File results keyed by name.
        """
        LOGGER.debug('creating dependency graph...')
        for result in results.values():
            absolute_name = result.absolute_name
            display_name = result.display_name
            # Class-style node name built from the module name and the file stem.
            node_name = f"{result.module_name}.{Path(display_name).stem}"
            self._digraph.add_node(node_name, absolute_name=absolute_name, display_name=display_name)
            for dependency in result.scanned_import_dependencies:
                # Assumes a networkx-style graph: skip existing nodes so the
                # display_name set from a file is not overwritten.
                if dependency not in self._digraph:
                    self._digraph.add_node(dependency, display_name=dependency)
                self._digraph.add_edge(node_name, dependency)
