Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

using ratchetFrom in a branch fails to detect the previous index in origin/main #1372

Closed
bwgjoseph opened this issue Oct 17, 2022 · 18 comments
Closed
Assignees

Comments

@bwgjoseph
Copy link

Hi,

I'm using maven (wrapper) 3.8.4 with spotless-maven-plugin 2.27.1 with the configuration defined as follows

<configuration>
	<ratchetFrom>origin/main</ratchetFrom>
	<java>
		<palantirJavaFormat>
			<version>xxx</version>
		</palantirJavaFormat>
		<removedUnusedImports/>
	</java>
	<upToDateChecking>
		<enabled>true</enabled>
		<indexFile>${project.basedir/.spotless-index</indexFile>
	</upToDateChecking>
</configuration>

So I have been working on main for a while, and it already has a .spotless-index file with the checksum and the listing of all the files. So far, no issue with that. But recently, I created a branch off my main called feature-a, and after the first commit in feature-a branch, I ran spotless:apply, then the .spotless-index file now has a new checksum and only listing files that was changed in feature-a branch.

Imagine in main branch

ajoicjaicjoiajscoijasjc098-
// listing all my source files (say 50 lines)

Then in feature-a branch (after a commit, and ran spotless:apply)

osdjvoidjsoyfboudf0bd= // new checksum
// listing 3 files (those that was changed in this new branch commit)

There was a log indicating

[INFO] Up-to-date checking enabled
[INFO] Fingerprint mismatch in the index file. Fallback to an empty index

I can't have the full logs pasted as this occurs in an air-gapped repo, so if there is further information required, please let me know.

However, based on one of my previous feature branch, it doesn't seem to suffer the same issue. So I don't quite understand if there was something that I did wrong when creating the branch, or some mis-configuration on my end.

Thanks!

@bwgjoseph
Copy link
Author

I faced this today again but this time there was no branch.

I did everything in the main branch, added some files, updated some files. After running spotless, it gave the same error again

@lutovich
Copy link
Contributor

lutovich commented Nov 7, 2022

Thanks for reporting this problem and sorry for the delay! I will investigate this week.

@lutovich
Copy link
Contributor

Hi @bwgjoseph,

I played around with ratchetFrom and upToDateChecking configuration options and can confirm that the index file might not list all files when both options are set. Option ratchetFrom limits the files Spotless plugins processes. Option upToDateChecking receives files allowed by ratchetFrom and updates the index.

The first line in the index file is fingerprint. It is based on:

  • plugin's configuration (basically serialized <plugin>...</plugin> section of the pom.xml)
  • formatter configurations (palantirJavaFormat and removedUnusedImports in your example)

The fingerprint will change if you bump spotless-maven-plugin version, add a new formatter, or change version of the palantirJavaFormat. Thus, fingerprint mismatch is not something bad and could happen when the plugin or its formatters are reconfigured.

Have spotless failed to format files or detect unformatted files when you switched branched? Or was Fingerprint mismatch... message the only problematic bit?

@bwgjoseph
Copy link
Author

Thanks, @lutovich for looking into it.

The fingerprint will change if you bump spotless-maven-plugin version, add a new formatter, or change version of the palantirJavaFormat. Thus, fingerprint mismatch is not something bad and could happen when the plugin or its formatters are reconfigured.

Should this be the case? Because by updating the config/version, it loses all the tracked formatted files. Which loses the meaning of upToDateChecking, isn't it?

Have spotless failed to format files or detect unformatted files when you switched branched?

No, or at least none that I can see. As also mentioned above, this happens even when I did not switch branch. And after your explanation, I will add that I did not make configuration change the pom section, or at least, what was configured for spotless.

To further clarify, you mentioned serialized the entire <plugin>...</plugin> section, which means that any changes to other plugin configuration will be affected as well?

@lutovich
Copy link
Contributor

@bwgjoseph fingerprint has to change when plugin version or formatter configuration changes. After a configuration change, the plugin no longer knows if the tracked files are correctly formatted and needs to re-process them. I think Spotless plugin configuration changes should be relatively infrequent and index invalidations should not affect performance.

Fingerprint includes serialized org.apache.maven.model.Plugin object. It is a Java value object that corresponds to <plugin>...</plugin>. The relevant code is here. Any change to spotless-maven-plugin version, dependencies, or configuration will invalidate the index. I think it should not be a problem, especially because you are also using ratchetFrom.

No, or at least none that I can see. As also mentioned above, this happens even when I did not switch branch. And after your explanation, I will add that I did not make configuration change the pom section, or at least, what was configured for spotless.

It is good to know that the plugin formats correctly. Do you remember if you've added/changed any non-spotless dependencies?

@bwgjoseph
Copy link
Author

After a configuration change, the plugin no longer knows if the tracked files are correctly formatted and needs to re-process them.

So to say that the index file, can always be destroyed and rebuild without affecting performance? And to further clarify (taken from README)

With up-to-date checking enabled, Spotless creates an index file in the target directory. The index file contains source file paths and corresponding last modified timestamps. It allows Spotless to skip already formatted files that have not changed.

That means if I don't have this enabled, (without having the index file), every time I run check/apply, it will do so for the entire codebase?

Any change to spotless-maven-plugin version, dependencies, or configuration will invalidate the index

Ah, I thought that you were saying that any changes to the entire <plugins>. But this is weird too, because I did not change anything within the spotless plugin, but I did make changes to other part of the pom.xml such as adding new dependencies and configuration. But I cannot remember if the fingerprint mismatched after I added the new dependencies, as I did not take extra note on it.

I could try to add new dependencies/configuration to and see if the fingerprint will mismatch again.

@bwgjoseph
Copy link
Author

I can verify that the fingerprint changes when I make changes to pom.xml and not within the section of spotless. So I added new dependency, and make some changes in plugin section of maven-compiler-plugin and run spotless:apply, right after added, it will cause the fingerprint to detect as mismatch.

@bwgjoseph
Copy link
Author

Would like to know if the index file is meant to, or is it ok to keep resetting?

Especially, like what is the flow for running spotless in a branch? For example, for personA to work on main, and personB to work on branchB, how do I expect the merge to happen if personB make some pom changes, and affect the fingerprint.

@lutovich
Copy link
Contributor

Thanks for checking this. I'm not sure PluginFingerprint needs to be this sensitive to dependency changes. I'll investigate this.

The index file keeps track of well-formatted files for a particular Spotless plugin configuration (represented by PluginFingerprint). It is perfectly fine for the index file to be invalidated when Spotless plugin configuration changes. Thus, Fingerprint mismatch... messages are expected to happen and are not harmful. However, index invalidations will affect Spotless plugin's performance and PluginFingerprint needs to capture the minimal possible configuration. I'll investigate if that's the case with dependencies.

@bwgjoseph
Copy link
Author

I understand that Fingerprint mismatch ... is to be expected when I change spotless configuration, and I generally agree with the approach as you have explained previously.

But I'm more concerned on how to use spotless when working with others, and what to expect. Given that, this index file can change quite frequently (with the current setup) and whatever tracked changes may frequently get reset, and thus, making the index file pretty much "not as useful" as one might have thought.

I think as you have explained before, fingerprint mismatch should only happen when spotless plugin configuration changes and not any other configuration within pom.xml which should greatly reduce the chance for the index file to keep resetting.

And I still have concern when using with large codebase, if the index file resets, how much impact would it make (as it needs to rebuild the index again each time spotless version gets upgraded - non-configuration change)? And how would it work in a monorepo (maven multi-module) project setup

Thanks for investigating, and answering to my queries. Appreciate it.

@lutovich
Copy link
Contributor

lutovich commented Jan 7, 2023

Hi @bwgjoseph,

Sorry for the delay! I created a PR to make the plugin reset the index only when plugin version changes or when its configuration changes: #1461.

In a multi-module project setup each module has a separate index file. A version or configuration change in the parent POM will invalidate index files in all modules. I think plugin upgrades and re-configurations are relatively infrequent compared to code changes that need to run mvn spotless:apply. Invalidations will occasionally happen but the up-to-date index is still useful to optimize the most common use case or making code changes + running the formatter.

When you pull changes made by other developers, the changed files will have a different last-modified timestamp compared to your local index entries. It means Spotless will have to re-format those files. I think if you pull often, then performance impact should be limited. However, if you pull and most of the codebase has changed then Spotless will have to re-format/re-check formatting for the entire codebase.

@bwgjoseph
Copy link
Author

Thanks for the PR!

I think what you described make sense, and with the PR, it should make the invalidation less frequent.

I currently have a multi-module project with the following setup.

project/
├── build-tools /
│   ├── src/
│   │   └── main/
│   │       └── resources/
│   │           └── somefiles.xml
│   └── pom.xml // no spotless config
├── parent-project/
│   ├── child-project/
│   │   ├── src/
│   │   │   ├── main
│   │   │   └── test
│   │   ├── .spotless-index
│   │   └── pom.xml
│   ├── .spotless-index
│   └── pom.xml // contain spotless config
└── pom.xml

The parent-project and child-project will generate the index file, but since parent-project contains no source or anything, the index file pretty much is just the checksum. As for the child-project, as it contains all the source code, the index file contains the checksum together with the list of files.

Is that right? Also, to verify, the parent-project will generate index file with empty listing but only the checksum?

When you pull changes made by other developers, the changed files will have a different last-modified timestamp compared to your local index entries.

In terms of PR, it is the responsibility of the developer to ensure the latest index file is also committed, is that right?

@lutovich
Copy link
Contributor

lutovich commented Jan 8, 2023

I think it makes sense that the index file is empty in parent-project and non-empty in the child-project. Spotless plugin is enabled for both parent-project and child-project so it creates the index file in both. You could enable up-to-date checking only in child-project POM so that the index file is not created in parent-project. Or even enable Spotless only in child-project if it is the only module that contains source code. I'm not sure it would be an improvement TBH :)

I don't think it's a good idea to commit the index file. It would be updated for every code change and will probably be a constant source of merge conflicts.

By default, Spotless puts the index file in the target directory which is typically git-ignored. Why do you configure an explicit <indexFile>${project.basedir}/.spotless-index</indexFile> to be outside of target?

@bwgjoseph
Copy link
Author

I don't think it's a good idea to commit the index file.

Somehow, I always had the impression to commit the index file so that the formatter does not need to re-run. But I never thought the other way round that if every developer had run at least once, they would have the updated index file anyway.

The reason why I wanted it to be outside the target is because each time I run mvn clean command, the index file will be flushed and requires a re-build, that seem to be ineffective to me. This way, the file (even not committed), will only re-build if and only if the configuration/version changes.

Am I not right in that? Did I misunderstand a certain concept? I don't think I have exceptionally huge repo, it will grow but still manageable size. Do you think that letting it rebuild each time clean is run, is going to be fine or even recommended?

Thanks!

@lutovich
Copy link
Contributor

lutovich commented Jan 8, 2023

Sounds like a fair reason to put the index file outside of target and git-ignore it 👍

I guess it comes down to how your team runs Maven commands. If a typical command includes clean then it does make sense to place the index file outside of target.

@bwgjoseph
Copy link
Author

Nowadays, I don't think clean is required anymore, but I think it's going to be hard to educate all developers, or sometimes the command will be included out of habit.

But I'll take your suggestion to git-ignore it.


Given that the PR is merged, any chance of releasing new version soon?

@nedtwigg
Copy link
Member

nedtwigg commented Jan 8, 2023

A new release will be cut in a few days, and I post the release version into all affected issues and pr's so you'll get a notification if you're subscribed to this issue.

@nedtwigg nedtwigg closed this as completed Jan 8, 2023
@nedtwigg
Copy link
Member

Published in plugin-maven 2.30.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants