Commit 9d4613b

Added streaming functionality for s3-based files! Now can be implemented with apps in Django and Flask
1 parent 95dcace commit 9d4613b

1 file changed

Lines changed: 5 additions & 20 deletions

README.md

@@ -15,37 +15,22 @@ This module is designed to run as simply as possible. Just provide the file loc
 
 ```
 from TextSpitter import TexSpitter as TS
-import sqlite3
-
-
 folder_loc = 'foo/bar/'
 
-# doc_file = folder_loc + 'file_thing.doc'
 docx_file = folder_loc + 'file_thing.docx'
 pdf_file = folder_loc + 'file_thing.pdf'
 text_file = folder_loc + 'file_thing.txt'
 
 doc_tup = (docx_file, pdf_file, text_file)
-# doc_tup = (doc_file, docx_file, pdf_file, text_file)
-
-# SQL code to write to database
-conn = sqlite3.connect('example_db')
-c= conn.cursor()
-
-STMNT = 'INSERT INTO doc_contents VALUE %s'
 
-# For Loop code to insert doc content into db
-for ele in doc_tup:
-    text = TS(ele)
-    c.executemany(STMNT, text)
-    print('Done! Wrote the following to db: %s', (text[:25]))
+raw_text_payload = [TS(ele) for ele in doc_tup]
+text = '\n'.join(raw_text_payload)
+return text
 ```
 
 ## TO DOs ##
-* [x] push to github
-* [x] Remove .doc support due to legacy format's extensive proprietary reqs
-* [ ] spruce up documentation
-* [ ] solicit feedback
+* [x] spruce up documentation
+* [X] Add stream functionality for s3-based file reading
 * [ ] expand functionality to other file types
 * [ ] TDB

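Review note: the added README lines end with a bare `return text`, which is only valid inside a function body. A minimal sketch of how the new snippet might be wrapped so it runs as written is shown here. The names `collect_text` and `read_plain_text` are hypothetical and not part of the commit, and the sketch assumes (the diff implies this but does not confirm it) that `TS(path)` returns a plain string that `'\n'.join` can consume.

```python
from pathlib import Path
from typing import Callable, Iterable


def read_plain_text(path: str) -> str:
    """Hypothetical stand-in extractor: reads a UTF-8 text file.

    In the README snippet this role is played by TS(path); it is an
    assumption here that TS(path) returns a str.
    """
    return Path(path).read_text(encoding="utf-8")


def collect_text(paths: Iterable[str],
                 extract: Callable[[str], str] = read_plain_text) -> str:
    """Extract text from each path and join the pieces with newlines."""
    # Same shape as the README's list comprehension, with the extractor
    # passed in so the function does not depend on a specific library.
    raw_text_payload = [extract(p) for p in paths]
    return "\n".join(raw_text_payload)
```

If `TS` behaves as assumed, `collect_text(doc_tup, extract=TS)` would presumably reproduce the README's intent, and the returned string could then be handed back from a Flask or Django view.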