Commit 67f86a2

hotfix: initializing FileExtractor with file_obj
1 parent f50f3e6 commit 67f86a2

3 files changed

Lines changed: 4 additions & 4 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@ I created this little app to help me process documents from folder sets and batc

 This is my first python module, so I hope I did this well!

-## Installation ##
+## Installation ##
 * Type `pip install TextSpitter`
 * **OPTIONAL** type `pip install PyMuPDF` to install the Python-MuPDF engine for better fidelity with text extraction (i.e.: maintaining correct White Spacing)
   * You will need to follow instructions to ensure that PyMuPDF's dependencies install to your system. There are wheels and binaries available for Windows, Linux, and MacOSX, though if you're on something weird like NetBSD/FreeBSD/specialty linux distros, you may e SOL. Fortunately, CLI options like Yum, Pkgin, Apt-Get and so forth will have packages available straight from the terminal.
@@ -23,7 +23,7 @@ text_file = folder_loc + 'file_thing.txt'

 doc_tup = (docx_file, pdf_file, text_file)

-raw_text_payload = [TS(ele) for ele in doc_tup]
+raw_text_payload = [TS(filename=ele) for ele in doc_tup]
 text = '\n'.join(raw_text_payload)
 return text
 ```
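
The README change above switches the example from a positional argument to `filename=`. The diff does not show the `TS` signature, so the following is only a sketch of why a keyword argument is the safer call style. The function `spit` and its parameter order are hypothetical stand-ins, not the library's API.

```python
def spit(file_obj=None, filename=None):
    """Hypothetical signature; the real TS parameter order is not shown in this diff."""
    if filename is not None:
        return f"text from path {filename}"
    return f"text from object {file_obj!r}"


docs = ("a.docx", "b.pdf", "c.txt")

# Positional call: under this assumed order, each path binds to file_obj.
positional = [spit(d) for d in docs]

# Keyword call: each path binds to filename regardless of parameter order.
keyword = [spit(filename=d) for d in docs]

print("\n".join(keyword))
```

Passing `filename=` explicitly keeps the README example correct even if the positional order of the constructor arguments changes between releases.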

TextSpitter/core.py

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ def __init__(
             self.file_ext = filename.split(".")[-1]
         else:
             if hasattr(file_obj, "name"):
-                self.file = file_obj.name
+                self.file = file_obj
                 self.file_ext = file_obj.name.split(".")[-1]
             else:
                 raise Exception(
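
The effect of this one-line fix can be sketched in isolation. The class below is a hypothetical stand-in that mirrors only the branch visible in the hunk. The `self.file = filename` assignment and the exception message are assumptions, since neither appears in the diff.

```python
import io


class FileExtractorSketch:
    """Hypothetical stand-in for the patched __init__ branch; not the library's class."""

    def __init__(self, filename=None, file_obj=None):
        if filename:
            self.file = filename  # assumed; this line is outside the hunk
            self.file_ext = filename.split(".")[-1]
        else:
            if hasattr(file_obj, "name"):
                # After the hotfix: keep the object itself rather than its
                # name, so its contents remain readable later on.
                self.file = file_obj
                self.file_ext = file_obj.name.split(".")[-1]
            else:
                # Message text is assumed; the diff truncates it.
                raise Exception("file_obj must have a 'name' attribute")


# An in-memory stream with a name, standing in for an uploaded file.
buf = io.BytesIO(b"hello")
buf.name = "notes.txt"

ext = FileExtractorSketch(file_obj=buf)
print(ext.file_ext)
print(ext.file is buf)
```

Before the fix, `self.file` held only the string `file_obj.name`, which for an in-memory stream like this one does not point to anything that can be reopened from disk.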

setup.py

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@

 setuptools.setup(
     name="TextSpitter",
-    version="0.3.5a3",
+    version="0.3.5a4",
     author="Francis Secada",
     author_email="francis.secada@gmail.com",
     description="Python package that spits out text from your document files!",
