Row Name Extraction for data.table() with keep.rownames#7136
Row Name Extraction for data.table() with keep.rownames#7136MichaelChirico merged 45 commits intomasterfrom
Conversation
|
Generated via commit 6dbc414 Download link for the artifact containing the test results: ↓ atime-results.zip
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #7136 +/- ##
=======================================
Coverage 98.50% 98.50%
=======================================
Files 81 81
Lines 15016 15032 +16
=======================================
+ Hits 14792 14808 +16
Misses 224 224 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hey @Mukulyadav2004, please ping when tests are passing, or if you're stuck & need an extra pair of eyes. Thanks! |
|
Thanks @MichaelChirico ! These both tests are passing when I run on Rstudio but can't figure out why they are failing here. |
R/as.data.table.R
Outdated
| if (any(nzchar(valid_names))) { | ||
| vector_rownames = valid_names | ||
| x[[i]] = unname(xi) | ||
| break |
There was a problem hiding this comment.
I am leaning towards merging this logic into the below loop. We can write check_rownames = !isFALSE(keep.rownames) and then the marginal cost of checking if (check_rownames && ...) is low. We can replace the early break with checking is.null(vector_rownames). WDYT?
inst/tests/tests.Rraw
Outdated
| x <- c(1, 2, 3) | ||
| y <- setNames(c(4, 5, 6), c("A", "B", "C")) | ||
| test(2329.1, as.data.table(list(x, y), keep.rownames=TRUE), data.table(rn=c("A", "B", "C"), V1=c(1, 2, 3), V2=c(4, 5, 6))) | ||
| test(2329.2, as.data.table(list(x, y), keep.rownames="custom"), data.table(custom=c("A", "B", "C"), V1=c(1, 2, 3), V2=c(4, 5, 6))) |
There was a problem hiding this comment.
Thanks! Let's make the test suite more extensive:
- Test also
list(y, x) - Test the behavior under
data.frame(), not justas.data.table - Test your condition about
any(nzchar(valid_names))
I also don't think we've matched data.frame behavior yet, c.f.
DF = data.frame(row.names = letters, V = 1:26)
head(rownames(data.frame(a = 26:1, DF)))
# [1] "a" "b" "c" "d" "e" "f"
You were affected by an underlying change in |
|
@Mukulyadav2004 I'm curious about the Observe that we already can produce M = cbind(1:3)
rownames(M) = rep("", 3L)
as.data.table(M, keep.rownames='blank')
# blank V1
# 1: 1
# 2: 2
# 3: 3Maybe we should drop that condition here too? |
…into issue_1916
|
I included the |
|
Thanks! The data.frame behavior makes sense, but yes, let's break from that. consistency within the data.table API is more important. |
|
learned a lot from your changes, thank you. |
| test(2330.5, as.data.table(data.frame(y, x), keep.rownames=TRUE), data.table(rn=c("A", "B", "C"), y=4:6, x=1:3)) | ||
|
|
||
| DF <- data.frame(row.names = letters[1:6], V = 1:6) # Test data.frame with explicit rownames | ||
| test(2330.6, as.data.table(list(a = 6:1, DF), keep.rownames=TRUE), data.table(rn=letters[1:6], a=6:1, V=1:6)) |
There was a problem hiding this comment.
rather than remove the rest with all-empty names, let's test the expected behavior in that case as well.
Please also add a test of list(M) for empty-rowname'd matrix input
MichaelChirico
left a comment
There was a problem hiding this comment.
Great, thank you! I'm going to start using this right away!!

fixes #1916
Implements row name extraction from named vectors in
data.table()andas.data.table()calls whenkeep.rownames=TRUEorkeep.rownames="column_name". This matches the behavior ofdata.frame()andas.data.frame.list()by extracting names from the first named atomic vector in the input.Problem : Currently,
data.table()andas.data.table()do not extract row names from named vectors likedata.frame().So added logic to
as.data.table.list()(the common path for bothdata.table()andas.data.table()) to:@MichaelChirico @tdhock can you please review.