Skip to content

Conversation

@yangxk1
Copy link
Contributor

@yangxk1 yangxk1 commented May 20, 2025

Reason for this PR

As described in issue #676 , there is currently no unit test to verify loading graph in java-info module. This PR adds unit tests for the Java info module to ensure correct loading of GraphInfo, VertexInfo, EdgeInfo, PropertyGroup, AdjacentList, and Property.

What changes are included in this PR?

Added assertions in GraphLoaderTest#testLoad to validate metadata integrity for key info classes using test datasets.

Are these changes tested?

yes

Are there any user-facing changes?

no

@Thespica Thespica requested a review from Copilot May 23, 2025 10:44
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds unit tests to the Java info module to verify that GraphLoader correctly populates GraphInfo, VertexInfo, and EdgeInfo metadata.

  • Refactors testLoad to separate assertions into testGraphInfo, testVertexInfo, and testEdgeInfo
  • Adds detailed Assert statements covering names, prefixes, file paths, chunk sizes, property groups, and adjacent lists
  • Imports protocol enums (FileType, DataType, AdjListType) for richer assertions
Comments suppressed due to low confidence (3)

maven-projects/info/src/test/java/org/apache/graphar/info/GraphLoaderTest.java:45

  • Instead of catching IOException and rethrowing as RuntimeException, consider declaring testLoad() with throws IOException and remove the try/catch to simplify the test.
public void testLoad() {

maven-projects/info/src/test/java/org/apache/graphar/info/GraphLoaderTest.java:96

  • [nitpick] Variable name firstName_lastName_gender uses snake_case; consider renaming to firstNameLastNameGenderGroup to follow Java lowerCamelCase conventions.
PropertyGroup firstName_lastName_gender = personVertexInfo.getPropertyGroups().get(1);

maven-projects/info/src/test/java/org/apache/graphar/info/GraphLoaderTest.java:74

  • The expected path contains a double slash (//), which may indicate unintended concatenation logic; verify and normalize path generation to avoid duplicate separators.
Assert.assertEquals("vertex/person//vertex_count", personVertexInfo.getVerticesNumFilePath()); //one more '/'

private void testGraphInfo(GraphInfo graphInfo) {
Assert.assertNotNull(graphInfo);
Assert.assertEquals("ldbc_sample", graphInfo.getName());
Assert.assertEquals("", graphInfo.getPrefix()); // is empty string?
Copy link

Copilot AI May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The inline question comment is unclear; remove // is empty string? or replace it with a clear explanation of why an empty prefix is expected.

Suggested change
Assert.assertEquals("", graphInfo.getPrefix()); // is empty string?
Assert.assertEquals("", graphInfo.getPrefix()); // The prefix is expected to be empty as the graph data does not define a prefix.

Copilot uses AI. Check for mistakes.
Comment on lines 139 to 149
AdjacentList adjOrderBySource = knowsEdge.getAdjacentList(AdjListType.ORDERED_BY_SOURCE);
Assert.assertEquals(FileType.CSV, adjOrderBySource.getFileType());
Assert.assertEquals(AdjListType.ORDERED_BY_SOURCE, adjOrderBySource.getType());
Assert.assertEquals("ordered_by_source/", adjOrderBySource.getPrefix());
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/vertex_count", knowsEdge.getVerticesNumFilePath(AdjListType.ORDERED_BY_SOURCE));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/edge_count0", knowsEdge.getEdgesNumFilePath(AdjListType.ORDERED_BY_SOURCE, 0));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/edge_count4", knowsEdge.getEdgesNumFilePath(AdjListType.ORDERED_BY_SOURCE, 4));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list", knowsEdge.getAdjacentListPrefix(AdjListType.ORDERED_BY_SOURCE));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/chunk0", knowsEdge.getAdjacentListChunkPath(AdjListType.ORDERED_BY_SOURCE, 0));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/chunk4", knowsEdge.getAdjacentListChunkPath(AdjListType.ORDERED_BY_SOURCE, 4));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/offset", knowsEdge.getOffsetPrefix(AdjListType.ORDERED_BY_SOURCE));
Copy link

Copilot AI May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The large number of repetitive assertions for each AdjListType could be extracted into a helper or parameterized test to reduce duplication and improve readability.

Suggested change
AdjacentList adjOrderBySource = knowsEdge.getAdjacentList(AdjListType.ORDERED_BY_SOURCE);
Assert.assertEquals(FileType.CSV, adjOrderBySource.getFileType());
Assert.assertEquals(AdjListType.ORDERED_BY_SOURCE, adjOrderBySource.getType());
Assert.assertEquals("ordered_by_source/", adjOrderBySource.getPrefix());
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/vertex_count", knowsEdge.getVerticesNumFilePath(AdjListType.ORDERED_BY_SOURCE));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/edge_count0", knowsEdge.getEdgesNumFilePath(AdjListType.ORDERED_BY_SOURCE, 0));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/edge_count4", knowsEdge.getEdgesNumFilePath(AdjListType.ORDERED_BY_SOURCE, 4));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list", knowsEdge.getAdjacentListPrefix(AdjListType.ORDERED_BY_SOURCE));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/chunk0", knowsEdge.getAdjacentListChunkPath(AdjListType.ORDERED_BY_SOURCE, 0));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/chunk4", knowsEdge.getAdjacentListChunkPath(AdjListType.ORDERED_BY_SOURCE, 4));
Assert.assertEquals("edge/person_knows_person//ordered_by_source//adj_list/offset", knowsEdge.getOffsetPrefix(AdjListType.ORDERED_BY_SOURCE));
assertAdjacentListProperties(
knowsEdge,
AdjListType.ORDERED_BY_SOURCE,
FileType.CSV,
"ordered_by_source/",
"edge/person_knows_person//ordered_by_source//adj_list/vertex_count",
"edge/person_knows_person//ordered_by_source//adj_list/edge_count0",
"edge/person_knows_person//ordered_by_source//adj_list/edge_count4",
"edge/person_knows_person//ordered_by_source//adj_list",
"edge/person_knows_person//ordered_by_source//adj_list/chunk0",
"edge/person_knows_person//ordered_by_source//adj_list/chunk4",
"edge/person_knows_person//ordered_by_source//adj_list/offset"
);

Copilot uses AI. Check for mistakes.
@Thespica
Copy link
Contributor

Hi @yangxk1 , thanks for your PR!

Could you new a java-info CI and make the unit test run automatically?

@Thespica Thespica self-requested a review May 23, 2025 10:47
@yangxk1
Copy link
Contributor Author

yangxk1 commented May 23, 2025

Hi @yangxk1 , thanks for your PR!

Could you new a java-info CI and make the unit test run automatically?

Of course! I'll give it a try.

@yangxk1
Copy link
Contributor Author

yangxk1 commented May 28, 2025

Hi @Thespica , I added a new CI workflow and it's now passing. Please help review when possible.

# specific language governing permissions and limitations
# under the License.

name: GraphAr Info CI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can rename it to GraphAr Java-Info CI, also remind to rename file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

jobs:
test:
runs-on: ubuntu-22.04
if: ${{ !contains(github.event.pull_request.title, 'WIP') && github.event.pull_request.draft == false }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allow running for draft PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the draft limit

Comment on lines 51 to 64
- name: Install dependencies
run: |
# install the latest arrow deb to test arrow
wget -c https://apache.jfrog.io/artifactory/arrow/"$(lsb_release --id --short | tr 'A-Z' 'a-z')"/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb \
-P /tmp/
sudo apt-get install -y /tmp/apache-arrow-apt-source-latest-"$(lsb_release --codename --short)".deb
sudo apt-get update -y
# TODO: ISSUE-241
sudo apt install -y libarrow-dev=17.0.0-1 \
libarrow-dataset-dev=17.0.0-1 \
libarrow-acero-dev=17.0.0-1 \
libparquet-dev=17.0.0-1
sudo apt-get install -y libboost-graph-dev ccache libcurl4-openssl-dev doxygen lcov
sudo apt-get install llvm-11 clang-11 lld-11 libclang-11-dev libz-dev -y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not need arrow or llvm or others dependencies here, the java-info module is implemented by pure java

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Comment on lines 68 to 78
private void testVertexInfo(GraphInfo graphInfo) {
VertexInfo personVertexInfo = graphInfo.getVertexInfos().get(0);
Assert.assertEquals("person", personVertexInfo.getType());
Assert.assertEquals(100, personVertexInfo.getChunkSize());
Assert.assertEquals("vertex/person/", personVertexInfo.getPrefix());
Assert.assertEquals(
"vertex/person//vertex_count",
personVertexInfo.getVerticesNumFilePath()); // one more '/'
Assert.assertEquals("vertex/person//person.vertex.yaml", personVertexInfo.getVertexPath());
Assert.assertNotNull(personVertexInfo.getPropertyGroups());
Assert.assertEquals(2, personVertexInfo.getPropertyGroups().size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we refactor the test context and make it more clear? (not only for testVertexInfo)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have refactored the test code by splitting it into multiple test methods. I think this is clearer

@Thespica Thespica requested a review from Copilot May 28, 2025 03:25
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds comprehensive unit tests for the java-info module to verify proper loading of GraphInfo and related entities. It also includes minor code refactoring in the saver/loader classes and an update to the CI workflow and pom.xml configuration.

  • Added detailed assertions in GraphLoaderTest to validate metadata integrity.
  • Reformatted path concatenation in LocalYamlGraphSaver and cleaned up redundant imports in LocalYamlGraphLoader.
  • Updated googleJavaFormat version in pom.xml and added a GitHub Actions workflow for the info module.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
maven-projects/info/src/test/java/org/apache/graphar/info/GraphLoaderTest.java Adds comprehensive unit tests for GraphInfo, VertexInfo, EdgeInfo, and related properties.
maven-projects/info/src/main/java/org/apache/graphar/info/saver/LocalYamlGraphSaver.java Refactors path concatenation formatting.
maven-projects/info/src/main/java/org/apache/graphar/info/loader/LocalYamlGraphLoader.java Removes redundant imports; minor code cleanup.
maven-projects/info/pom.xml Updates googleJavaFormat version for improved code formatting consistency.
.github/workflows/info.yml Introduces a new CI workflow for the info module.
Comments suppressed due to low confidence (1)

maven-projects/info/src/test/java/org/apache/graphar/info/GraphLoaderTest.java:74

  • The expected file path contains a double slash, which might lead to confusion. Please verify if the extra separator is intentional or adjust the concatenation logic to ensure a cleaner, uniform path format.
Assert.assertEquals("vertex/person//vertex_count", personVertexInfo.getVerticesNumFilePath()); // one more '/'

@yangxk1 yangxk1 requested a review from Thespica May 28, 2025 09:17
@Thespica Thespica requested a review from Copilot May 29, 2025 07:52
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds unit tests and CI support for the Java info module, and refactors YAML loader/saver formatting.

  • Introduces comprehensive GraphInfoTest to validate metadata loading.
  • Refactors path‐building logic in LocalYamlGraphSaver and cleans up imports in LocalYamlGraphLoader.
  • Updates Google Java Format version and adds a GitHub Actions workflow for CI.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
maven-projects/info/src/test/java/org/apache/graphar/info/GraphInfoTest.java New unit tests covering GraphInfo, VertexInfo, EdgeInfo, and related classes
maven-projects/info/src/main/java/org/apache/graphar/info/saver/LocalYamlGraphSaver.java Reformats path concatenation for saving YAML files
maven-projects/info/src/main/java/org/apache/graphar/info/loader/LocalYamlGraphLoader.java Removes duplicate imports and simplifies constructor syntax
maven-projects/info/pom.xml Bumps googleJavaFormat plugin version from 11 to 1.7
.github/workflows/java-info.yml Adds CI workflow for building, formatting, and testing Java-Info
Comments suppressed due to low confidence (3)

maven-projects/info/src/main/java/org/apache/graphar/info/saver/LocalYamlGraphSaver.java:42

  • Files.createFile will fail if the file already exists. Consider using Files.newBufferedWriter(outputPath, StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING) or checking for existence first.
Files.createFile(outputPath);

maven-projects/info/src/test/java/org/apache/graphar/info/GraphInfoTest.java:53

  • [nitpick] The clean() method is empty and not doing any teardown. If no cleanup is needed, consider removing this method to reduce clutter.
@AfterClass
    public static void clean() {}

maven-projects/info/pom.xml:97

  • [nitpick] The Google Java Format plugin version was changed from 11 to 1.7. Please verify this matches the plugin’s supported versions to avoid build issues.
<version>1.7</version>

public void testGraphInfoBasics() {
Assert.assertNotNull(graphInfo);
Assert.assertEquals("ldbc_sample", graphInfo.getName());
Assert.assertEquals("", graphInfo.getPrefix()); // This shouldn't be an empty string
Copy link

Copilot AI May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion expects an empty prefix but the comment says it shouldn’t be empty. Please reconcile the expected value or update/remove the comment to match the intended behavior.

Suggested change
Assert.assertEquals("", graphInfo.getPrefix()); // This shouldn't be an empty string
Assert.assertEquals("ldbc_sample_prefix", graphInfo.getPrefix()); // The prefix should match the expected value

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should review whether returning an empty string is the intended behavior here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For vertex info and edge info, prefix means the path that store vertex and edge data. But for Graph Info, I'm not sure what's the meaning of prefix. BTW, test data ldbc_sample doesn't have prefix:
https://github.com/apache/incubator-graphar-testing/blob/955596c325ceba7b607e285738e3dd0ce4ff424e/ldbc_sample/csv/ldbc_sample.graph.yml#L1-L6

Comment on lines 74 to 76
"vertex/person//vertex_count",
personVertexInfo.getVerticesNumFilePath()); // one more '/'
Assert.assertEquals("vertex/person//person.vertex.yaml", personVertexInfo.getVertexPath());
Copy link

Copilot AI May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Double slashes appear in expected paths (e.g., "vertex/person//vertex_count"). Consider normalizing prefixes in GraphInfo to avoid duplicate separators.

Suggested change
"vertex/person//vertex_count",
personVertexInfo.getVerticesNumFilePath()); // one more '/'
Assert.assertEquals("vertex/person//person.vertex.yaml", personVertexInfo.getVertexPath());
"vertex/person/vertex_count",
personVertexInfo.getVerticesNumFilePath()); // normalized path
Assert.assertEquals("vertex/person/person.vertex.yaml", personVertexInfo.getVertexPath());

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting for the code that splices the paths to fix it together

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can add a TODO to mark that. add related issue link will be better

Comment on lines +35 to +41
final Path outputPath =
FileSystems.getDefault()
.getPath(
path
+ FileSystems.getDefault().getSeparator()
+ graphInfo.getName()
+ ".graph.yaml");
Copy link

Copilot AI May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Manual string concatenation with getSeparator() is verbose and error-prone. Consider using Paths.get(path, graphInfo.getName() + ".graph.yaml") for clarity and cross-platform safety.

Suggested change
final Path outputPath =
FileSystems.getDefault()
.getPath(
path
+ FileSystems.getDefault().getSeparator()
+ graphInfo.getName()
+ ".graph.yaml");
final Path outputPath = Paths.get(path, graphInfo.getName() + ".graph.yaml");

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's formatted by mvn Spotless

@yangxk1
Copy link
Contributor Author

yangxk1 commented May 30, 2025

Hi @Thespica , Is there anything else that needs to be modified in this PR?

public void testGraphInfoBasics() {
Assert.assertNotNull(graphInfo);
Assert.assertEquals("ldbc_sample", graphInfo.getName());
Assert.assertEquals("", graphInfo.getPrefix()); // This shouldn't be an empty string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For vertex info and edge info, prefix means the path that store vertex and edge data. But for Graph Info, I'm not sure what's the meaning of prefix. BTW, test data ldbc_sample doesn't have prefix:
https://github.com/apache/incubator-graphar-testing/blob/955596c325ceba7b607e285738e3dd0ce4ff424e/ldbc_sample/csv/ldbc_sample.graph.yml#L1-L6

Comment on lines 74 to 76
"vertex/person//vertex_count",
personVertexInfo.getVerticesNumFilePath()); // one more '/'
Assert.assertEquals("vertex/person//person.vertex.yaml", personVertexInfo.getVertexPath());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can add a TODO to mark that. add related issue link will be better

@yangxk1
Copy link
Contributor Author

yangxk1 commented Jun 1, 2025

Thanks for the feedback @Thespica . I’ve created a new issue #698 to further discuss "prefix" & '/'.

@yangxk1 yangxk1 requested a review from Thespica June 4, 2025 05:11
Copy link
Contributor

@Thespica Thespica left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM~

@lixueclaire lixueclaire merged commit 3a16108 into apache:main Jun 9, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants