Add table spec changes for statistics information in table snapshot #4945

Merged
merged 1 commit into from Jul 26, 2022

Conversation

findepi
Collaborator

@findepi findepi commented Jun 2, 2022

No description provided.

@findepi
Collaborator Author

findepi commented Jun 2, 2022

cc @alexjo2144 @RussellSpitzer

@findepi findepi changed the title Findepi/stats in table spec Add statistics information in table snapshot Jun 2, 2022
@findepi
Collaborator Author

findepi commented Jun 2, 2022

Currently based on #4944

@findepi
Collaborator Author

findepi commented Jun 2, 2022

Supersedes apache/iceberg-docs#77 (just for reference, no useful content/comments there)

@findepi findepi force-pushed the findepi/stats-in-table-spec branch from 405fe41 to 5599290 Compare June 6, 2022 20:59
@findepi findepi force-pushed the findepi/stats-in-table-spec branch from 5599290 to 7a27f31 Compare June 7, 2022 07:29
@findepi
Collaborator Author

findepi commented Jun 7, 2022

Comments applied. @rdblue, mind taking another look?

@findepi findepi force-pushed the findepi/stats-in-table-spec branch 2 times, most recently from cc66021 to 57f3f73 Compare June 7, 2022 07:37
@findepi findepi force-pushed the findepi/stats-in-table-spec branch from 57f3f73 to 26da01e Compare June 7, 2022 07:47
@findepi findepi changed the title Add statistics information in table snapshot Add table spec changes for statistics information in table snapshot Jun 7, 2022
format/spec.md Outdated
| _required_ | **`statistics-path`** | `string` | Path of the statistics file. See [Puffin file format](../puffin-spec). |
| _required_ | **`file-size-in-bytes`** | `long` | Size of the statistics file. |
| _required_ | **`file-footer-size-in-bytes`** | `long` | Size of the statistics file's footer. See [Puffin file format](../puffin-spec) for footer definition. |
| _required_ | **`source-sequence-number`** | `long` | Table sequence number at which the stats were calculated |
Contributor

@rdblue rdblue Jun 8, 2022

Shouldn't this also include a snapshot ID at which the stats were calculated?

Also, if this is to be used in v1, we need to rely on snapshot ID rather than sequence number because v1 tables use a null sequence number.

Collaborator Author

> Also, if this is to be used in v1,

My intention was for this to be a v2 feature.

> Shouldn't this also include a snapshot ID at which the stats were calculated?

Also, or instead of? Please advise.

Contributor

Yes, I think this should have the snapshot at which the stats were calculated.

There's also no reason to not support this in v1. As long as the snapshot ID is here for tracking, it should work just fine.
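
A minimal sketch of the point above, assuming (as stated earlier in this thread) that v1 snapshots carry a null sequence number. Keying statistics by snapshot ID gives the same lookup for both format versions. The `Snapshot` class, the `find_stats` helper, and the file paths are hypothetical names for illustration, not Iceberg APIs.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a table snapshot."""
    snapshot_id: int
    sequence_number: Optional[int]  # None models a v1 table


def find_stats(stats_by_snapshot_id: Dict[int, str], snapshot: Snapshot) -> Optional[str]:
    """Return the statistics path recorded for this snapshot, or None."""
    # Only the snapshot ID is consulted, so a null sequence number is harmless.
    return stats_by_snapshot_id.get(snapshot.snapshot_id)


# Made-up statistics paths, keyed by the snapshot they were computed at.
stats = {111: "stats-111.puffin", 222: "stats-222.puffin"}

v1_snapshot = Snapshot(snapshot_id=111, sequence_number=None)
v2_snapshot = Snapshot(snapshot_id=222, sequence_number=7)

print(find_stats(stats, v1_snapshot))
print(find_stats(stats, v2_snapshot))
```

A lookup keyed by sequence number would have nothing to match on for `v1_snapshot`, which is the concern raised in the review.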

Collaborator Author

Changed to source-snapshot-id and made the whole thing supported in v1.
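
For illustration, here is a rough sketch of what a statistics file entry could look like after this change. It uses only the field names visible in the diff excerpt above, with `source-sequence-number` replaced by `source-snapshot-id`. The `StatisticsFileEntry` class, its method, and the sample values are hypothetical, and the field set in the merged spec may differ from this intermediate revision.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class StatisticsFileEntry:
    """Hypothetical in-memory form of one statistics file entry."""
    statistics_path: str            # path of the Puffin statistics file
    file_size_in_bytes: int         # size of the statistics file
    file_footer_size_in_bytes: int  # size of the statistics file's footer
    source_snapshot_id: int         # snapshot at which the stats were calculated

    def to_metadata(self) -> dict:
        """Map to the hyphenated field names used in the spec table."""
        return {
            "statistics-path": self.statistics_path,
            "file-size-in-bytes": self.file_size_in_bytes,
            "file-footer-size-in-bytes": self.file_footer_size_in_bytes,
            "source-snapshot-id": self.source_snapshot_id,
        }


# Sample values are made up for the example.
entry = StatisticsFileEntry(
    statistics_path="s3://bucket/table/metadata/stats-42.puffin",
    file_size_in_bytes=4096,
    file_footer_size_in_bytes=512,
    source_snapshot_id=42,
)

print(json.dumps(entry.to_metadata(), indent=2))
```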

Collaborator Author

A side question -- now that we have v2 and we will have v3 of the spec, why do we want to add a new feature to v1?

@findepi findepi force-pushed the findepi/stats-in-table-spec branch from 8125ad7 to 5b2d4ee Compare July 22, 2022 13:57
@findepi findepi force-pushed the findepi/stats-in-table-spec branch from 5b2d4ee to 528eb9c Compare July 22, 2022 14:23
@github-actions github-actions bot added the core label Jul 22, 2022
@findepi
Collaborator Author

findepi commented Jul 22, 2022

> @findepi, can you update this to add the stats structure that we're discussing in the thread?

Done (#4945 (comment))

@findepi findepi requested review from rdblue and removed request for rdblue July 22, 2022 14:24
@findepi findepi force-pushed the findepi/stats-in-table-spec branch from 528eb9c to 01097d8 Compare July 25, 2022 15:25
@findepi
Collaborator Author

findepi commented Jul 25, 2022

applied/answered the comments

@findepi
Collaborator Author

findepi commented Jul 25, 2022

Thank you @rdblue for your detailed review.

@findepi findepi requested a review from rdblue July 25, 2022 15:26
@findepi findepi force-pushed the findepi/stats-in-table-spec branch from 01097d8 to 9737dc3 Compare July 26, 2022 13:41
@findepi
Collaborator Author

findepi commented Jul 26, 2022

Thank you @rdblue for your detailed review. Applied comments!

@findepi findepi requested a review from rdblue July 26, 2022 13:43
@rdblue rdblue merged commit 4687798 into apache:master Jul 26, 2022
@rdblue
Contributor

rdblue commented Jul 26, 2022

Merged! Thanks for all your work to get this in, @findepi!

@findepi findepi deleted the findepi/stats-in-table-spec branch July 27, 2022 08:30

10 participants