Hi,
We are using the code below to highlight a word in a PDF:
    public String highlight(String inputPath, String outputPath, String searchWord1, javax.servlet.http.HttpServletResponse res)
    {
        BufferedInputStream bis = null;
        BufferedOutputStream bos = null;
        final String searchWord = searchWord1;

        // 1. Open the PDF file!
        File file;
        try
        { file = new File(inputPath); }
        catch(Exception e)
        { throw new RuntimeException(inputPath + " file access error.", e); }

        count = 0;
        Pattern pattern = Pattern.compile(searchWord, Pattern.CASE_INSENSITIVE);

        // 2. Iterating through the document pages...
        TextExtractor textExtractor = new TextExtractor(true, true);
        for(final Page page : file.getDocument().getPages())
        {
            // 2.1. Extract the page text!
            Map<Rectangle2D,List<ITextString>> textStrings = textExtractor.extract(page);

            // 2.2. Find the text pattern matches!
            final Matcher matcher = pattern.matcher(TextExtractor.toString(textStrings));

            // 2.3. Highlight the text pattern matches!
            textExtractor.filter(
                textStrings,
                new TextExtractor.IIntervalFilter()
                {
                    public boolean hasNext()
                    {
                        if (matcher.find())
                        {
                            count++;
                            return true;
                        }
                        return false;
                    }

                    public Interval<Integer> next()
                    { return new Interval<Integer>(matcher.start(), matcher.end()); }

                    public void process(Interval<Integer> interval, ITextString match)
                    {
                        // Defining the highlight box of the text pattern match...
                        List<Quad> highlightQuads = new ArrayList<Quad>();
                        {
                            Rectangle2D textBox = null;
                            for(TextChar textChar : match.getTextChars())
                            {
                                Rectangle2D textCharBox = textChar.getBox();
                                if(textBox == null)
                                { textBox = (Rectangle2D)textCharBox.clone(); }
                                else
                                {
                                    if(textCharBox.getY() > textBox.getMaxY())
                                    {
                                        highlightQuads.add(Quad.get(textBox));
                                        textBox = (Rectangle2D)textCharBox.clone();
                                    }
                                    else
                                    { textBox.add(textCharBox); }
                                }
                            }
                            textBox.setRect(textBox.getX(), textBox.getY(), textBox.getWidth(), textBox.getHeight()+5);
                            highlightQuads.add(Quad.get(textBox));
                        }
                        TextMarkup temp = new TextMarkup(page, searchWord, MarkupTypeEnum.Highlight, highlightQuads);
                        temp.setColor(new DeviceRGBColor((35.0/255.0), (35.0/255.0), (142.0/255.0)));
                        //TextMarkup temp = new TextMarkup(
                        temp.setVisible(true);
                        // temp.setColor(new DeviceRGBColor((35.0/255.0), (35.0/255.0), (142.0/255.0)));
                        //temp.setColor("white");
                    }

                    public void remove()
                    { throw new UnsupportedOperationException(); }
                }
            );
        }

        SerializationModeEnum serializationMode = SerializationModeEnum.Incremental;
        ServletOutputStream out = null;
        try
        {
            file.save(new java.io.File(outputPath), serializationMode);
            //ByteArrayOutputStream baos = new ByteArrayOutputStream();
            file.close();
        }

However, we are getting the error below for some PDFs:

    org.pdfclown.util.NotImplementedException: LZWDecode
        org.pdfclown.bytes.filters.Filter.get(Filter.java:74)
        org.pdfclown.objects.PdfStream.getBody(PdfStream.java:193)
        org.pdfclown.objects.PdfStream.getBody(PdfStream.java:155)
        org.pdfclown.documents.contents.Contents$ContentStream.moveNextStream(Contents.java:279)
        org.pdfclown.documents.contents.Contents$ContentStream.<init>(Contents.java:86)
        org.pdfclown.documents.contents.Contents.load(Contents.java:591)
        org.pdfclown.documents.contents.Contents.<init>(Contents.java:366)
        org.pdfclown.documents.contents.Contents.wrap(Contents.java:345)
        org.pdfclown.documents.Page.getContents(Page.java:571)
        org.pdfclown.documents.contents.ContentScanner.<init>(ContentScanner.java:1033)
        org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:297)
        com.att.bcpr.actions.MyPdfHighlighting.highlight(MyPdfHighlighting.java:92)
        com.att.bcpr.actions.MyPdfHighlighting.doGet(MyPdfHighlighting.java:59)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:620)
        javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
        org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
        org.apache.struts2.dispatcher.ng.filter.StrutsPrepareAndExecuteFilter.doFilter(StrutsPrepareAndExecuteFilter.java:86)

When we traced the log, we found that the error is raised by the line below:

    Map<Rectangle2D,List<ITextString>> textStrings = textExtractor.extract(page);

Please help me with this. Thanks
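Since the trace ends in Filter.get throwing NotImplementedException for LZWDecode, it looks like the failing PDFs contain LZW-compressed streams. To narrow down which files are affected, a rough check like the one below could be run on each PDF before highlighting. This is only a sketch: the class LzwFilterCheck and its methods are made up for illustration and are not part of PDF Clown, and the check is a heuristic. It only looks for the literal name /LZWDecode in the raw file bytes, so it can miss a filter declared inside a compressed object stream, and it could in principle match unrelated binary data.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of PDF Clown: flags PDFs that mention /LZWDecode.
public class LzwFilterCheck {

    // PDF name used in a stream dictionary to declare LZW compression.
    private static final byte[] LZW_NAME =
        "/LZWDecode".getBytes(StandardCharsets.ISO_8859_1);

    /** Returns true if the raw bytes contain the name /LZWDecode anywhere. */
    public static boolean mentionsLzwDecode(byte[] pdfBytes) {
        outer:
        for (int i = 0; i + LZW_NAME.length <= pdfBytes.length; i++) {
            for (int j = 0; j < LZW_NAME.length; j++) {
                if (pdfBytes[i + j] != LZW_NAME[j]) {
                    continue outer; // mismatch, try the next start offset
                }
            }
            return true;
        }
        return false;
    }

    /** Reads the whole file into memory and applies the byte-level check. */
    public static boolean fileMentionsLzwDecode(String path) throws IOException {
        return mentionsLzwDecode(Files.readAllBytes(Paths.get(path)));
    }

    public static void main(String[] args) throws IOException {
        // Usage: java LzwFilterCheck first.pdf second.pdf ...
        for (String path : args) {
            String verdict = fileMentionsLzwDecode(path)
                ? "mentions /LZWDecode (heuristic match)"
                : "no /LZWDecode name found";
            System.out.println(path + ": " + verdict);
        }
    }
}
```

Files flagged by this check could then be skipped or reported to the user instead of letting textExtractor.extract(page) fail in the servlet.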
Please guide me on this.