Description: KBookOCR is an intelligent document recognition (OCR) system.
Input: Specify the document you want to recognize (djvu, pdf, or image) and select the language of the input document.
Pages to process: Recognition can be run on the entire document or on a selected page range.
Preview size: A few preview size options are available for the preview window on the left: Native, Small.
Output: Output documents can be saved in txt format (specify the folder you want to save to) or opened with OpenOffice.
Based on: CuneiForm
Note: The quality of the output file depends on the quality of the input source and on the work of the third-party OCR package.
GitHub: http://github.com/b0noI/KBookOCR
PS: If you enjoyed our program, do not forget to click "+", or you can even donate.
Last changelog:
2.1 - better KDE integration, better UI, prebuilt packages for x32 only
2.0 - new major version of KBookOCR. All new: GUI, project system, integration with cuneiform, scanner support (KSane). More stable and faster than the 1.x versions
1.4.1 - you can load the last project and continue working on it
1.4.0 - book page thumbnails for recognition, batch scanning option
1.3.1 - preview of scanned pages, some GUI usability improvements
1.3 - new GUI
1.2 - output in rtf and html (layout support), scanner support (via scanimage), GUI changed
UPD3: compile ONLY with Qt >= 4.7
UPD2: src is here, enjoy
UPD: rpm and binary tar.gz for all x32 distros are here; src is coming soon
Ratings & Comments
56 Comments
10 10 the best
10 10 the best
Hi. I tried to install version 2.1 on Chakra Linux (Arch-based) and couldn't. In Konsole I unpacked the tar.gz, but when I ran ./configure it said that the directory does not exist. What can I do to install it? Thank you (sorry for my English)
I've downloaded the package "kbookocr 2.0 (amd64.deb)" and have trouble with its dependencies: it requires libpoppler-qt4-3, but I only have libpoppler-qt4-4
I am currently trying to package your program, but I did not succeed.
https://build.opensuse.org/package/live_build_log?arch=i586&package=kbookocr&project=home%3AMailaender&repository=openSUSE_12.1
/usr/lib/gcc/i586-suse-linux/4.6/../../../../i586-suse-linux/bin/ld: kbookocr.o: undefined reference to symbol 'KIcon::~KIcon()'
/usr/lib/gcc/i586-suse-linux/4.6/../../../../i586-suse-linux/bin/ld: note: 'KIcon::~KIcon()' is defined in DSO /usr/lib/libkdeui.so.5 so try adding it to the linker command line
/usr/lib/libkdeui.so.5: could not read symbols: Invalid operation
collect2: ld returned 1 exit status
make: *** [KBookocr] Error 1
error: Bad exit status from /var/tmp/rpm-tmp.eLtV7J (%build)
This is getting better and better. Today I've 'ocred' a 104-page pdf. The result is decent, but there are a few issues:
- even if I select the document language, some characters are not recognized (—, not -)
- every line that ends with '-' splitting a word results in a broken paragraph
- the output has no formatting (bold, alignment, font size, margins)
Can this be fixed, or is it a cuneiform limitation?
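The broken-paragraph issue described above could be worked around by post-processing the saved txt output. The following is a minimal sketch, not part of KBookOCR or cuneiform: the function name `rejoin_hyphenated` is hypothetical, and it assumes that a hyphen at a line end followed by a lowercase letter marks a split word, which will not hold for every text.

```python
import re

# A hyphen directly after a word character, followed by one or more line
# breaks and then another word character. The continuation character is
# captured inside the lookahead so that it is inspected but not consumed.
_SPLIT = re.compile(r"(?<=\w)-[ \t]*\n+[ \t]*(?=(\w))")


def rejoin_hyphenated(text: str) -> str:
    """Join words that were split by a hyphen at the end of a line."""

    def _join(match: re.Match) -> str:
        # Assumption: a lowercase continuation means the hyphen was only a
        # line-break hyphen, so it is dropped. Otherwise (capital letter or
        # digit) the hyphen is kept and only the line break is removed.
        return "" if match.group(1).islower() else "-"

    return _SPLIT.sub(_join, text)
```

For example, `rejoin_hyphenated("recog-\nnition")` gives `"recognition"`, while `"Anglo-\nSaxon"` becomes `"Anglo-Saxon"`. Real compound words that happen to break at a hyphen before a lowercase letter would be joined incorrectly, so the result still needs proofreading.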
This new version fails to compile on my Slackware 13.37.
viewadder.cpp:33:6: warning: unused parameter ‘doc’
viewadder.cpp:33:6: warning: unused parameter ‘n’
g++ -c -O2 -march=i486 -mtune=i686 -O2 -march=i486 -mtune=i686 -Wall -W -D_REENTRANT -DQT_NO_DEBUG -DQT_GUI_LIB -DQT_CORE_LIB -DQT_SHARED -I/usr/lib/qt/mkspecs/linux-g++ -I. -I/usr/lib/qt/include/QtCore -I/usr/lib/qt/include/QtGui -I/usr/lib/qt/include -I/usr/include/poppler/qt4 -I. -I. -o scanerdialog.o scanerdialog.cpp
scanerdialog.cpp:2:29: fatal error: ui_scanerdialog.h: No such file or directory
compilation terminated.
make: *** [scanerdialog.o] Error 1
make: *** Waiting for unfinished jobs....
ocrthread.cpp: In member function ‘bool OCRThread::startOCR()’:
ocrthread.cpp:168:1: warning: no return statement in function returning non-void
ocrthread.cpp: In member function ‘QString OCRThread::getImgAt(int)’:
ocrthread.cpp:238:1: warning: control reaches end of non-void function
Same for me...
Sorry. Fixed. You can redownload now
Many thanks, KBookOCR compiles and works fine now!
1. Why not use tesseract as an optional or second OCR engine? It is actually the most accurate OCR engine for Linux.
2. How about OCR of multilingual documents?
3. How can I OCR only a part of an image?
Each of your questions is part of our roadmap. For example, version 2.2 will support tesseract. As for the other two points, unfortunately we cannot yet say exactly which version will implement them.
Thanks. The Layout option is a good move but needs polishing. It arranges the output in a different order than the original.
Hello, first I'd like to thank you for your efforts. When building KBookOCR for Mageia 1 x86_64 I noticed that there is a library hardcoded with its path. See for yourself:
[doktor5000@mageia1 KBookocr]$ grep -R libksane.so ./
./KBookocr.pro:/usr/lib/libksane.so
./Makefile:LIBS = $(SUBLIBS) -L/usr/lib64 -L/usr/lib -lpoppler-qt4 /usr/lib/libksane.so -lQtGui -L/usr/lib64 -lQtCore -lpthread
This is no good and breaks the build on x86_64. Please fix it in the next release.
Thanks! We will fix it in 2.1
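For packagers who want to check for this kind of problem before building, here is a minimal sketch that scans the text of a qmake project file or Makefile for absolute paths to shared libraries and suggests the equivalent `-l` flag, which lets the linker find the library in its own search path (including lib64). The function name `find_hardcoded_libs` and the regular expression are illustrative assumptions, not part of KBookOCR or qmake.

```python
import re

# An absolute path to a shared library such as /usr/lib/libksane.so or
# /usr/lib64/libfoo.so.5. The lookbehind skips paths that are glued to a
# flag, e.g. the "/usr/lib64" in "-L/usr/lib64".
_ABS_LIB = re.compile(r"(?<![\w./-])(/[\w.+/-]*/lib([\w+-]+)\.so(?:\.\d+)*)")


def find_hardcoded_libs(build_text: str):
    """Return (line_number, path, suggested_flag) for each absolute .so path."""
    hits = []
    for lineno, line in enumerate(build_text.splitlines(), start=1):
        for match in _ABS_LIB.finditer(line):
            # "libksane.so" becomes "-lksane": drop the "lib" prefix and
            # the ".so" suffix, as the linker expects for -l flags.
            hits.append((lineno, match.group(1), "-l" + match.group(2)))
    return hits
```

Run on the Makefile line quoted in the report, this returns a single hit for `/usr/lib/libksane.so` with the suggestion `-lksane`. Whether replacing the path with that flag is enough depends on the distribution's library search paths.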
Looks pretty good. I wonder, however, what the key points are that make it different from YAGF?
The answer is partly in the name of the program. The program began as a way to work with books (pdf, djvu), whereas other programs work only with images and scanners. Next, we plan to support multiple OCR engines (not just cuneiform). Many new features will also be implemented, for example automatic language detection and improved recognition of document structure. KBookOCR finally killed all analog books :)
Hi, I am running Kubuntu 11.04 amd64. When installing the package 'kbookocr_2.0.5.x64.deb' using 'gdebi-gtk' from Konsole, everything goes OK until the end, when I get this screenshot ( http://farm6.static.flickr.com/5067/5879629016_f367c0f77c_z.jpg ). The terminal says nothing. But I can run the program, and things appear to work fine, so it's a bit strange. I have been digging a bit and I don't know if this is OK (a screenshot from Synaptic) ( http://farm7.static.flickr.com/6001/5879629020_6fde1285e7_b.jpg ); maybe it tells you something, maybe this is not the problem. On the other hand, I have found that cuneiform is at version '1.1.0+dfsg-1' in 'Oneiric Ocelot', and I was wondering if KBookOCR is ready for this version. Congratulations on the application, it looks very promising! Regards.
It really was a small error in the dependencies. It will be gone in the next version, 2.1.
Is it possible to build this without KSane support?
If that is enough, I can publish a src version without scanner support and without KSane. Ready-made (built) packages without KSane won't be available in 2.0; maybe in 2.x (or even later :( ) we will make a version for Gnome (GBookOCR :) )
Thanks for KBookOCR. I made a package [0] for Pardus [1]. Pardus users can install KBookOCR via the following commands:
$ sudo pisi it -c system.devel
$ sudo pisi bi https://svn.pardus.org.tr/pardus/playground/maidis/ocr/cuneiform/pspec.xml
$ sudo pisi it cuneiform*.pisi
$ sudo pisi bi https://svn.pardus.org.tr/pardus/playground/maidis/ocr/kbookocr/pspec.xml
$ sudo pisi it kbookocr*.pisi
I also made a desktop file [2] for adding an entry to the KDE menu and a patch [3] for fixing compilation on Pardus 2011. Could you add these to KBookOCR, if they are OK?
Do you plan to support other OCR systems (Tesseract, GOCR, Ocrad...)?
[0] https://svn.pardus.org.tr/pardus/playground/maidis/ocr/kbookocr/
[1] http://www.pardus.org.tr/eng/
[2] https://svn.pardus.org.tr/pardus/playground/maidis/ocr/kbookocr/files/kbookocr.desktop
[3] https://svn.pardus.org.tr/pardus/playground/maidis/ocr/kbookocr/files/add-kde4-include-dir.diff
Thank you for your work and for creating a repo for KBookOCR. Yes, we plan to support other OCR engines. As for the desktop file, it is present in the deb and rpm packages, but you are certainly right and I will add it to the src as well.
Any chance of working with tesseract too in the future?