use lazyregexp for various regular expressions #25

Draft
thaJeztah wants to merge 1 commit into distribution:main from thaJeztah:lazyregexp

@thaJeztah
Member

Using regexp.MustCompile at package level consumes a significant amount of memory when the package is imported, even if those regular expressions are never used.

This change compiles those regular expressions through an internal lazyregexp package, so that each one is only compiled the first time it's used.
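As a rough illustration of the idea (not the code in this PR), a lazy wrapper can store the pattern string and defer compilation to first use with sync.Once. The type name, method set, and example pattern below are assumptions made for the sketch:

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// Regexp is a hypothetical lazy wrapper: it stores the pattern at
// construction time and compiles it on first use.
type Regexp struct {
	str  string
	once sync.Once
	rx   *regexp.Regexp
}

// New records the pattern without compiling it, so calling it from a
// package-level variable does no regexp work during init.
func New(str string) *Regexp {
	return &Regexp{str: str}
}

// re compiles the pattern exactly once. Like regexp.MustCompile it
// panics on an invalid pattern, but at first use instead of at import.
func (r *Regexp) re() *regexp.Regexp {
	r.once.Do(func() { r.rx = regexp.MustCompile(r.str) })
	return r.rx
}

// MatchString forwards to the compiled expression.
func (r *Regexp) MatchString(s string) bool {
	return r.re().MatchString(s)
}

// tagRegexp uses an illustrative pattern, chosen only for this sketch.
var tagRegexp = New(`^[\w][\w.-]{0,127}$`)

func main() {
	fmt.Println(tagRegexp.MatchString("v1.0.0")) // true
	fmt.Println(tagRegexp.MatchString("-bad"))   // false
}
```

One trade-off of this approach is that an invalid pattern is no longer caught at import time, so tests should exercise each lazily compiled expression at least once.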

Various exported regular expressions are still compiled on import; changing them to a sync.OnceValue would be a breaking change. We can still decide to do so, but I'm leaving that for a follow-up.
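To show why that would be breaking, here is a minimal sketch with hypothetical variable names and an illustrative pattern (neither is taken from the package). sync.OnceValue (Go 1.21+) returns a function, so the exported symbol changes type from *regexp.Regexp to func() *regexp.Regexp and every caller has to add a call:

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// EagerRegexp has type *regexp.Regexp and is compiled during package
// initialization, whether or not anything uses it.
var EagerRegexp = regexp.MustCompile(`^[a-z0-9]+$`)

// LazyRegexp has type func() *regexp.Regexp. The pattern is compiled
// on the first call, and later calls return the same cached value.
var LazyRegexp = sync.OnceValue(func() *regexp.Regexp {
	return regexp.MustCompile(`^[a-z0-9]+$`)
})

func main() {
	// Existing callers use the variable directly.
	fmt.Println(EagerRegexp.MatchString("abc123")) // true

	// After the change, callers would need to call the function first.
	fmt.Println(LazyRegexp().MatchString("abc123")) // true
}
```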

To verify, compile a basic binary that imports the package:

package main

import _ "github.com/distribution/reference"

func main() {}

Before:

for i in $(seq 1 5); do GODEBUG=inittrace=1 ./before 2>&1 | grep distribution/reference; done

init github.com/distribution/reference @0.94 ms, 0.22 ms clock, 415712 bytes, 3599 allocs
init github.com/distribution/reference @0.39 ms, 0.22 ms clock, 415712 bytes, 3599 allocs
init github.com/distribution/reference @0.39 ms, 0.23 ms clock, 415712 bytes, 3599 allocs
init github.com/distribution/reference @0.45 ms, 0.27 ms clock, 415712 bytes, 3599 allocs
init github.com/distribution/reference @0.38 ms, 0.24 ms clock, 415712 bytes, 3599 allocs

After:

for i in $(seq 1 5); do GODEBUG=inittrace=1 ./after 2>&1 | grep distribution/reference; done

init github.com/distribution/reference/internal/lazyregexp @0.85 ms, 0 ms clock, 0 bytes, 0 allocs
init github.com/distribution/reference @1.0 ms, 0.16 ms clock, 238680 bytes, 1383 allocs
init github.com/distribution/reference/internal/lazyregexp @0.33 ms, 0 ms clock, 0 bytes, 0 allocs
init github.com/distribution/reference @0.42 ms, 0.16 ms clock, 238680 bytes, 1383 allocs
init github.com/distribution/reference/internal/lazyregexp @0.39 ms, 0 ms clock, 0 bytes, 0 allocs
init github.com/distribution/reference @0.47 ms, 0.19 ms clock, 238680 bytes, 1383 allocs
init github.com/distribution/reference/internal/lazyregexp @0.36 ms, 0 ms clock, 0 bytes, 0 allocs
init github.com/distribution/reference @0.47 ms, 0.14 ms clock, 238680 bytes, 1383 allocs
init github.com/distribution/reference/internal/lazyregexp @0.29 ms, 0 ms clock, 0 bytes, 0 allocs
init github.com/distribution/reference @0.38 ms, 0.15 ms clock, 238680 bytes, 1383 allocs
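For a quick sense of scale, the relative reduction can be computed from the byte and allocation counts in the traces above. This is only a small helper sketch; the function name is made up for the example:

```go
package main

import "fmt"

// reduction returns the percentage decrease going from before to after.
func reduction(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Values copied from the inittrace output above.
	fmt.Printf("bytes:  %.1f%% fewer\n", reduction(415712, 238680)) // bytes:  42.6% fewer
	fmt.Printf("allocs: %.1f%% fewer\n", reduction(3599, 1383))     // allocs: 61.6% fewer
}
```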

@codecov

codecov bot commented Apr 17, 2026

Codecov Report

❌ Patch coverage is 65.00000% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.02%. Comparing base (6ccba5a) to head (34ca3d7).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/lazyregexp/lazyregexp.go | 65.00% | 7 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #25      +/-   ##
==========================================
- Coverage   84.21%   83.02%   -1.19%     
==========================================
  Files           5        6       +1     
  Lines         304      324      +20     
==========================================
+ Hits          256      269      +13     
- Misses         38       45       +7     
  Partials       10       10              

@thaJeztah thaJeztah force-pushed the lazyregexp branch 2 times, most recently from c9b0f97 to 7d6ee23 Compare April 18, 2026 15:22
Using regexp.MustCompile at package level consumes a significant amount
of memory when the package is imported, even if those regular
expressions are never used.

This change compiles those regular expressions through an internal
lazyregexp package, so that each one is only compiled the first time
it's used.

Various exported regular expressions are still compiled on import;
changing them to a sync.OnceValue would be a breaking change. We can
still decide to do so, but that is left for a follow-up.

To verify, compile a basic binary that imports the package:

    package main

    import _ "github.com/distribution/reference"

    func main() {}

Before:

    for i in $(seq 1 5); do GODEBUG=inittrace=1 ./before 2>&1 | grep distribution/reference; done

    init github.com/distribution/reference @0.94 ms, 0.22 ms clock, 415712 bytes, 3599 allocs
    init github.com/distribution/reference @0.39 ms, 0.22 ms clock, 415712 bytes, 3599 allocs
    init github.com/distribution/reference @0.39 ms, 0.23 ms clock, 415712 bytes, 3599 allocs
    init github.com/distribution/reference @0.45 ms, 0.27 ms clock, 415712 bytes, 3599 allocs
    init github.com/distribution/reference @0.38 ms, 0.24 ms clock, 415712 bytes, 3599 allocs

After:

    for i in $(seq 1 5); do GODEBUG=inittrace=1 ./after 2>&1 | grep distribution/reference; done

    init github.com/distribution/reference/internal/lazyregexp @0.85 ms, 0 ms clock, 0 bytes, 0 allocs
    init github.com/distribution/reference @1.0 ms, 0.16 ms clock, 238680 bytes, 1383 allocs
    init github.com/distribution/reference/internal/lazyregexp @0.33 ms, 0 ms clock, 0 bytes, 0 allocs
    init github.com/distribution/reference @0.42 ms, 0.16 ms clock, 238680 bytes, 1383 allocs
    init github.com/distribution/reference/internal/lazyregexp @0.39 ms, 0 ms clock, 0 bytes, 0 allocs
    init github.com/distribution/reference @0.47 ms, 0.19 ms clock, 238680 bytes, 1383 allocs
    init github.com/distribution/reference/internal/lazyregexp @0.36 ms, 0 ms clock, 0 bytes, 0 allocs
    init github.com/distribution/reference @0.47 ms, 0.14 ms clock, 238680 bytes, 1383 allocs
    init github.com/distribution/reference/internal/lazyregexp @0.29 ms, 0 ms clock, 0 bytes, 0 allocs
    init github.com/distribution/reference @0.38 ms, 0.15 ms clock, 238680 bytes, 1383 allocs

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>