Replies: 8 comments 4 replies
-
Hi @maxx-ukoo
That is a surprisingly sharp drop-off. What kind of storage are you using? (Local SSD? Local disk? A remote file store? ...)
Load it all in a single run. The loaders all exploit the fact that the database is empty and manipulate it at a low level on that basis; otherwise loading is unoptimized.
-
Hi @afs
I tried processing up to 20 files in a batch. So technically the database is not empty: it contains 8,806,955 triples in the http://rdf.ncbi.nlm.nih.gov/pubchem/ruleset graph after loading the OWL file.
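For reference, the triple count in that graph can be checked offline with `tdb2.tdbquery`. A minimal sketch, assuming a dataset directory at `/data/pubchem-tdb2` (the location is a placeholder, not from the thread):

```shell
# Count the triples in the ruleset named graph of a TDB2 dataset.
# --loc points at the dataset directory; the path here is an assumption.
tdb2.tdbquery --loc /data/pubchem-tdb2 \
  'SELECT (COUNT(*) AS ?n)
   WHERE { GRAPH <http://rdf.ncbi.nlm.nih.gov/pubchem/ruleset> { ?s ?p ?o } }'
```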
-
One point: "-Xmx28g -Xms4g". The way TDB works, running with a large heap will slow down loading. TDB uses the OS file system for accessing and caching its index files; that caching isn't on the heap, and a large heap takes space away from the OS. Try "-Xmx4g -Xms4g".
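If it helps, the Jena command-line scripts pick up JVM options from the `JVM_ARGS` environment variable, so the smaller fixed heap can be set like this (the dataset location and file list are placeholders, not from the thread):

```shell
# Keep the Java heap small so the OS page cache has room for TDB2's
# memory-mapped index files. JVM_ARGS is read by the Jena launch scripts.
export JVM_ARGS="-Xmx4g -Xms4g"

# Bulk-load with the small heap; paths below are illustrative only.
tdb2.tdbloader --loc /data/pubchem-tdb2 pubchem/*.ttl.gz
```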
-
Could this be the VM I/O being throttled?
Correction --
-
I'm having problems with PubChem. After downloading all the files (via FTP, as described on the website), about 5% of them are corrupt gz files. Retrying usually gets a valid file, but at least one needed 3 attempts. I also found some syntax errors, but given the gz problems it isn't yet clear whether they are related or whether the files really are illegal Turtle. Syntax errors are a nuisance when bulk loading: it's hard to know what is actually in the database, and harder to find and fix it. Are you trying to load all 2065 files?
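Corrupt archives can be caught before loading, since `gzip -t` exits non-zero on a truncated or damaged file. A self-contained sketch (the file names and directory are made up for the demonstration):

```shell
# Demonstrate catching a corrupt .gz before bulk loading.
mkdir -p /tmp/gzcheck
cd /tmp/gzcheck

# A valid archive, and a deliberately truncated copy of it.
printf 'example data\n' | gzip > good.ttl.gz
head -c 10 good.ttl.gz > bad.ttl.gz

# `gzip -t` tests integrity without decompressing to disk.
for f in *.ttl.gz; do
  if gzip -t "$f" 2>/dev/null; then
    echo "OK  $f"
  else
    echo "BAD $f"
  fi
done
```

Running this over the downloaded PubChem directory instead of `/tmp/gzcheck` would list exactly which files need re-fetching.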
-
That's the "basic" loader. There is Loading one file of compounds files ( Script:
-
I don't have access to a hardware setup similar to yours. I was using a locally attached NVMe SSD. It does seem to be related to I/O load.
The other choice is a database like Oxigraph, which uses RocksDB (I checked with the project, and it does do incremental loads). RocksDB is more write-oriented. Sorry for the non-answer, but without hardware to recreate this I can only speculate.
-
@afs Thanks for the update. After a few extra tests, I think I found the issue: I need a lot of memory for the upload. I ran everything from scratch, starting with a 16 GB VM and a 4 GB Java heap. After the upload speed decreased, I increased the memory to 32 GB, then to 64 GB. Each memory upgrade improved the upload speed. However, when I reduced the memory from 64 GB back to 32 GB, the upload speed dropped again. I believe memory determines the upload speed.

-
I am going to load the PubChem dataset (https://ftp.ncbi.nlm.nih.gov/pubchem/RDF/, https://pubchem.ncbi.nlm.nih.gov/docs/rdf-load).
I start Fuseki, stop it, and then try to upload the data into the dataset directory using the tdb2.tdbloader utility.
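As a sketch, that load-offline-then-serve sequence looks like the following (the dataset directory, file names, and service name are placeholders, not taken from the thread):

```shell
# 1. With Fuseki stopped, bulk-load directly into the TDB2 dataset directory.
#    Paths below are illustrative assumptions.
tdb2.tdbloader --loc run/databases/pubchem pubchem-rdf/*.ttl.gz

# 2. Restart Fuseki serving the same TDB2 location under /pubchem.
fuseki-server --loc run/databases/pubchem /pubchem
```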
I have a few questions: