Joachim Garth jgarth Tiny Empire Berlin, Germany tinyempire.de

jgarth/MCResource 11

A library to interact with a RESTful Rails backend in Cappuccino

jgarth/runcoderun 5

Run any shell script in an atom tab.

jgarth/clearance_http_auth 2

Simple, instant HTTP Basic Authentication for applications using Clearance

jgarth/herstory 1

Records changes to ActiveRecord models

jgarth/NSAttributedString-DDHTML 1

Simplifies working with NSAttributedString by allowing you to use HTML to describe formatting behaviors.

jgarth/AlamofireObjectMapper 0

An Alamofire extension which converts JSON response data into swift objects using ObjectMapper

issue comment gettalong/hexapdf

Please resolve conflicting statements about license.

Thanks for bringing this up! Could you clarify where you found the three points below the quoted text?

Generally, if you don't want to adhere to the AGPL or can't adhere to it (e.g. because you can't make the code that uses HexaPDF available under AGPL), then you need a commercial license.

So the easiest way to think about this is: Can I do what I need using the AGPL licensed version? If yes, then you don't need a commercial license.

In your case I guess that you probably need a commercial license but please contact someone with legal expertise.

brandondrew

comment created time in 3 days

issue opened gettalong/hexapdf

Please resolve conflicting statements about license.

There are several conflicting criteria given on the web site regarding when a paid license is required.

The AGPL puts some restrictions in place to make sure that the community benefits from changes to the code. For example, if you use HexaPDF in an application and distribute that application, you have to make the source code of the whole application available under the AGPL. The same applies if your application is used over a network.

If you want to use HexaPDF in a commercial setting, you need to purchase a commercial license.

  1. When there are changes made to the HexaPDF code.
  2. When the HexaPDF code is added to an app (such as when used as a gem—which does not involve changing HexaPDF at all) that is used over a network—therefore any Rails app ever.
  3. When it is used in a commercial setting.

In my case I would use it in a Rails app for a university, probably not making any changes to the HexaPDF code. So only criterion number 2 would apply.

I would never criticize your choice to require a paid license for your hard work: you have a right to earn money from it. But if #2 is sufficient to require a paid license, then why are 1 and 3 even mentioned? Please clarify. And if #2 is not sufficient, can it be removed?

created time in 3 days

created tag gettalong/hexapdf

tag REL_0_13_0

Versatile PDF creation and manipulation for Ruby

created time in 8 days

push event gettalong/hexapdf

Thomas Leitner

commit sha 2790ea25e5c5c05bef7dc1ba5c2c84f0a01487ab

Reconstruct cross reference table for damaged PDFs

Some PDF files are damaged in a way that preserves their content but still makes them invalid. This happens, for example, when content is added to the front or end of a file. Most of these problems can be overcome easily by being more relaxed with parsing, e.g. searching for the PDF marker in the first 1024 bytes.

Some of these errors are more serious because there is some invalid information, e.g. wrong offsets in the cross reference table. Such an error can still be recovered from by serially parsing the file front to back and storing the found objects in a reconstructed cross reference table.

This commit implements such an algorithm and allows HexaPDF to read nearly all PDF files. There are two caveats for this:

1. The damaged file must be written in a way that allows such a recovery.
2. The reconstructed file may, in rare cases, not represent the original file correctly because some information is interpreted wrongly.
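
The recovery idea described in this commit message can be sketched in plain Ruby. This is only an illustration under stated assumptions, not HexaPDF's actual implementation: the method names `find_header_offset` and `reconstruct_xref` are hypothetical, and the scan uses a simple regular expression rather than a real PDF tokenizer, so it could be fooled by object-like text inside streams.

```ruby
# Hypothetical sketch, not HexaPDF code: rebuild an object-offset table by
# scanning the raw bytes of a file from front to back.

# Returns the byte offset of the "%PDF-" marker if it occurs within the
# first 1024 bytes, or nil otherwise.
def find_header_offset(data)
  data.byteslice(0, 1024).index('%PDF-'.b)
end

# Maps [oid, gen] to the byte offset of each "oid gen obj" marker found.
# Later definitions overwrite earlier ones, as in incremental updates.
def reconstruct_xref(data)
  xref = {}
  data.scan(/(?<![0-9])(\d+)[ \t\r\n]+(\d+)[ \t\r\n]+obj\b/) do
    match = Regexp.last_match
    xref[[match[1].to_i, match[2].to_i]] = match.begin(0)
  end
  xref
end

# A tiny damaged "file": junk in front of the header shifts all offsets.
SAMPLE = ("junk\n%PDF-1.4\n" +
          "1 0 obj\n<< /Type /Catalog >>\nendobj\n" +
          "2 0 obj\n(hello)\nendobj\n").b

HEADER_OFFSET = find_header_offset(SAMPLE)
XREF = reconstruct_xref(SAMPLE)
```

The resulting hash plays the role of the reconstructed cross reference table: each object is looked up by the offset where it was actually found, not by the (possibly wrong) offset stored in the file.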

Thomas Leitner

commit sha a82d647e6ec67ee24fa061bff5ffe691c0c1104d

Rearrange tests in parser test suite

Thomas Leitner

commit sha d3c7fb5446da42a8bc8610992a613126a210cda6

Fix reporting of cross-reference section entry parsing error

If a cross-reference section entry is not parsable, the error is currently reported as recoverable. Fix this by making it non-recoverable.

Thomas Leitner

commit sha 9470cd29628801b7730f67cb198df270b0b75a42

Use PDF version 1.0 for dictionary fields defining no version requirement

Thomas Leitner

commit sha 7ce8cd59516ac68ff806685b1ddd10c9aad5af41

Change TrueType font validation to ignore missing fields for standard fonts

The standard 14 PDF fonts need to be Type1 fonts. However, there are PDF files out in the wild which specify the standard 14 PDF fonts as TrueType fonts and leave out the required fields. Since it seems that PDF viewers just ignore the missing fields, we do the same.

Thomas Leitner

commit sha e605993f303ebc5e19f78e8c95e964f685541173

Remove unnecessary checking whether value is duplicatable

Instances of some classes were not duplicatable until Ruby 2.4. This includes numbers, true, false, nil and symbols. Since these restrictions are not valid for Ruby 2.4 or newer, we can remove the code that checks for these classes before duplicating an object.
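
The Ruby behaviour this commit relies on can be checked in isolation. The sketch below is not taken from HexaPDF; `deep_dup` is a hypothetical helper that illustrates why no class check is needed on Ruby 2.4 or newer, where `dup` on numbers, `true`, `false`, `nil` and symbols returns the receiver instead of raising.

```ruby
# Hypothetical helper, assuming Ruby 2.4 or newer: duplicate nested arrays
# and hashes without first checking whether a value is duplicatable.
def deep_dup(value)
  case value
  when Array then value.map { |item| deep_dup(item) }
  when Hash then value.map { |key, item| [deep_dup(key), deep_dup(item)] }.to_h
  else value.dup # 1, nil, true and :sym simply return themselves here
  end
end

ORIGINAL = { name: 'page', size: [595, 842], rotate: nil, visible: true }
COPY = deep_dup(ORIGINAL)
```

The copy compares equal to the original, but its nested array and string are distinct objects, so modifying one does not affect the other.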

Thomas Leitner

commit sha ca497d4f5061ffe5f6d36ca1ff447aaedcc4a1ab

Don't rely on validation for a correct Type::Resources object

When such an object is created through a convenience method, make sure that it is already valid. This ensures correct behaviour even if validation is never done.

Thomas Leitner

commit sha 82404c26cc4913f3f582f61e8799e25447e3672d

Make xref parse error messages more concise

Thomas Leitner

commit sha 7f9f273b0488f518f4377a15e464e6d9c2614f7a

Update document/object validation code

The validation methods Object#validate and Document#validate do nearly the same, yet yield different arguments to the provided block. Furthermore, only the first validation problem can be reported.

This changes the internals so that regardless of which method is invoked, the same kind of arguments are always yielded. Furthermore, the internals of the Object#validate and #perform_validation methods have been changed so that multiple validation problems may be yielded.

Since the interfaces and internal workings have been changed, this is also a breaking change.
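
The design described here, one argument list for every yielded problem and no stopping at the first problem, might look roughly like the following. This is a self-contained sketch and not HexaPDF's API: the class `SketchObject`, the order of the yielded arguments, and the rule that only `Type` is correctable are all assumptions made for illustration.

```ruby
# Hypothetical sketch, not HexaPDF's API: a validator that yields every
# problem with the same argument list instead of stopping at the first one.
class SketchObject
  attr_reader :oid, :fields

  def initialize(oid, fields)
    @oid = oid
    @fields = fields
  end

  # Yields (message, correctable, object) once per problem and returns
  # true only if no uncorrectable problem was found.
  def validate(required: [:Type, :Parent])
    valid = true
    required.each do |name|
      next if fields.key?(name)
      correctable = (name == :Type) # assumption: only Type has a default
      yield("Required field #{name} is not set", correctable, self) if block_given?
      valid &&= correctable
    end
    valid
  end
end

PROBLEMS = []
RESULT = SketchObject.new(8, {}).validate do |msg, correctable, obj|
  PROBLEMS << [obj.oid, msg, correctable]
end
```

Because the block is called once per problem, a caller can collect all messages, raise on the first uncorrectable one, or ignore correctable ones, without the validator having to know which.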

Thomas Leitner

commit sha ebd762b0fbcaf904fb44cc36a83cbdc230cf8f34

Update rubocop to 1.0 and fix issues

Thomas Leitner

commit sha 6f876e1c80304ba822bf3be9f2e838f5f5762ad0

Allow lowercase page size names with image2pdf command

Thomas Leitner

commit sha adfaa67fd210e317db6221eb65fd6bfd9171f856

Fix URL in API documentation

Thomas Leitner

commit sha 3b0d6699272afcfe566313de8f047f582b0bd6f8

Fix error when parsing an invalid object number in inspect command

Thomas Leitner

commit sha 4e95a369ac8a82a2d2dac0df948c529e4b0d22f8

Provide better error messages for hexapdf inspect commands

Thomas Leitner

commit sha 98a0e0722cc1a7fda2ebdf2ea9fb9bfa3bb7d2f1

Fix typo in manual page

Thomas Leitner

commit sha d10beb9956267ab5a65d089f3dbbb7b01e48cc08

Add hexapdf inspect commands for showing page object/content streams by page number

Thomas Leitner

commit sha a1ff6706785b7263f7eaf9952cfeae159fc3b9fa

Output error messages in hexapdf inspect command to stderr

Thomas Leitner

commit sha 5fa5b11db7004669cbdfd4da476f4ad4f932cf04

Update API documentation of Document with info on dispatched messages

Thomas Leitner

commit sha fc2a25ba90df3f19b8af9ecde12594fc3bd0a033

Fix warning on Ruby 2.7

Thomas Leitner

commit sha 961a32676e5a4ca30133514e498d571e99a6f682

Add --check flag to CLI command info

The --check flag allows checking a file for parse and validation errors.

push time in 8 days

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

👏 👏 👏 👏

efatsi

comment created time in 10 days

issue closed gettalong/hexapdf

Content processor show_text NoMethodError for NilClass

(This is the same context as the colorspace issue we dealt with, in my redactor.) It is failing on the line noted below. The log lines saying 'show_text_string' have the value of str. The failure occurs when the string appears to be a single dash '-'.

rails_1_35fba6963d05 | [DEBUG] [fd6955] REDACT: show_text string  1RWH__WKLV_LV_IOLSSHG_IURP_RXU_GLVFXVVLRQ_LQ_WRGD\
rails_1_35fba6963d05 | [DEBUG] [fd6955]  V_PHHWLQJ___,W_QHHGV_WR_EH_WKLV_ZD\_WR_PDNH_WKH_WLPHOLQH_ZRUN_DQG_VR_SHRSOH_JH
W_WR nil? false blank? false
rails_1_35fba6963d05 | [DEBUG] [fd6955] REDACT: show_text string  NQRZ_ZKDW_FRPPLWWHHV_WKH\_DUH_PHPEHUV_RI_EHIRUH_WKRVH
_FRPPLWWHHV_PHHW_ nil? false blank? false
rails_1_35fba6963d05 | [DEBUG] [fd6955] REDACT: show_text string  - nil? false blank? false
rails_1_35fba6963d05 | [INFO ] [fd6955] method=GET  path=/story/
ontroller=story/articles action=view status=500  error='NoMethodError:undefined method `[]' for nil:NilClass'  duration=
881.97 db=0.00 uid=1  full_url=http://t
rails_1_35fba6963d05 | [FATAL] [fd6955]
rails_1_35fba6963d05 | [FATAL] [fd6955] NoMethodError (undefined  method `[]' for nil:NilClass):
rails_1_35fba6963d05 | [FATAL] [fd6955]    app/services/hexrep.rb:48:in `show_text'
rails_1_35fba6963d05 | [FATAL] [fd6955]    app/services/hexrep.rb:135:in `block in redact'
rails_1_35fba6963d05 | [FATAL] [fd6955]    app/services/hexrep.rb:130:in `with_index'
rails_1_35fba6963d05 | [FATAL] [fd6955]    app/services/hexrep.rb:130:in `redact'

hexrep.rb


class Hexrep < HexaPDF::Content::Processor

  def initialize(page)
    super()
    @canvas = page.canvas(type: :overlay)
    @boxeslist = []
    @emails = []
    @strings = []
  end


  def show_text(str)

    begin
      strings = decode_text(str)
      @strings << strings
      boxes = decode_text_with_positioning(str) # Exception happens on this line, line 48
      boxes.each do |box|
        @boxeslist << box
      end
    rescue NoMethodError => ex
      Rails.logger.warn "PDF redact failure on string #{str}"
    end

  end

 #.........

closed time in 10 days

joshco

issue comment gettalong/hexapdf

Content processor show_text NoMethodError for NilClass

@joshco Since I didn't get any feedback on this problem, I can't investigate it. Therefore I will close this issue for now. If you have any further information, please add a comment and I will re-open the issue.

joshco

comment created time in 10 days

issue closed gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

I'm getting the following error when trying to retrieve the pages of a local PDF:

HexaPDF::MalformedPDFError: PDF malformed around position : The oid,gen (43,0) values of the indirect object don't match the values (44,0) from the xref

The code:

pdf = HexaPDF::Document.open("/Users/efatsi/Desktop/Ratified.pdf")
pdf.pages.count # this line triggers the error.

The PDF appears mostly normal (desktop icon is slightly different), and opens well enough in other tools, so I'm not sure what this error means. The only other clue I have is that if I open the PDF in Preview, and then export it as a PDF, then that version works. Meaning OSX Preview is able to squash whatever formatting issue there is.

Curious if you had any thoughts as to what might be going on here. Unfortunately I can't share the culprit PDF as it contains sensitive information, but I could try to track down a similarly messed up one if that would help diagnose.
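
One plausible reading of the error message above is that the cross reference table points at a byte offset where a different object starts than the one it claims. The check can be sketched as follows. This is not HexaPDF's code: `xref_mismatch` and `OBJECT_HEADER` are hypothetical names, and the sample data is made up.

```ruby
# Hypothetical sketch, not HexaPDF code: verify that the object found at the
# offset recorded in the xref table carries the expected oid and gen.
OBJECT_HEADER = /\A(\d+)[ \t\r\n]+(\d+)[ \t\r\n]+obj\b/

# Returns nil if the entry is consistent, otherwise a description of the
# problem.
def xref_mismatch(data, oid, gen, offset)
  match = OBJECT_HEADER.match(data.byteslice(offset, 64).to_s)
  return "no indirect object at offset #{offset}" unless match
  found = [match[1].to_i, match[2].to_i]
  return nil if found == [oid, gen]
  "oid,gen (#{found.join(',')}) of the indirect object don't match " \
    "the values (#{oid},#{gen}) from the xref"
end

# Made-up data: object 43 starts at byte offset 9, right after the header.
DATA = "%PDF-1.4\n43 0 obj\n(text)\nendobj\n".b
```

Calling `xref_mismatch(DATA, 44, 0, 9)` reports a mismatch because the table entry names object 44 while the bytes at that offset begin object 43; `xref_mismatch(DATA, 43, 0, 9)` returns `nil`.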

closed time in 10 days

efatsi

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

@matpowel I will close this issue since the original problem is solved with the next release. If you can try out the scripts in https://github.com/gettalong/hexapdf/issues/51#issuecomment-724119150 with your PDF file and one of them fails, please open a new issue - thank you!

efatsi

comment created time in 10 days

issue closed gettalong/hexapdf

Strange behavior

Firstly, I love this project. Pure Ruby PDF manipulation, clean looking APIs. Super excited, well done everyone.

The big BUT for me is that the couple of PDFs I've tried it on yield super unexpected results, check it out:

hp = HexaPDF::Document.open('in.pdf')
=> ...
hp.pages.size
=> 4
hp.pages[0]
=> #<HexaPDF::Type::Page ...>
hp.pages.delete_at(0)
=> #<HexaPDF::Type::Page ...>
hp.write('out.pdf')
Traceback: ...
HexaPDF::Error (Validation error for (8,0): Required field Parent is not set)

Also for a 4 page PDF, using the CLI when trying to split a PDF has the following weird behavior:

  • hexapdf modify --force in.pdf -i 2,3 out.pdf
    • produces a 2 page PDF (correct) but both pages are totally blank
  • hexapdf split in.pdf
    • correctly splits the PDF into 4 single page PDFs!

I can't post the PDF I tried because it's a client bill but I believe all 4 pages are images.

UPDATE: I PDF'd the wsj.com homepage so I'd have a vector PDF with a combination of images and text and I got the same outcome using the Ruby console but this time the CLI modify tool worked as expected.

What is the status of this project? Is it anywhere near production ready? I'm not sure how much time I can apply to helping right now, but if our team adopts this project we can try to set some time aside to help.

Thanks.

closed time in 10 days

matpowel

issue comment gettalong/hexapdf

Strange behavior

@matpowel FYI I have just implemented the page moving method - see https://github.com/gettalong/hexapdf/commit/38b62bf12db5757f72091770fc2352762598e51f

Since I can't really do anything with respect to the modify/split problem, I will close this issue for now. If you have some more information that could help with this, I will re-open it for further investigation.

matpowel

comment created time in 10 days

push event gettalong/hexapdf

Thomas Leitner

commit sha a0b87347ef00d8cbbbb2194354b66c151069bd4e

Reconstruct cross reference table for damaged PDFs

Some PDF files are damaged in a way that preserves their content but still makes them invalid. This happens, for example, when content is added to the front or end of a file. Most of these problems can be overcome easily by being more relaxed with parsing, e.g. searching for the PDF marker in the first 1024 bytes.

Some of these errors are more serious because there is some invalid information, e.g. wrong offsets in the cross reference table. Such an error can still be recovered from by serially parsing the file front to back and storing the found objects in a reconstructed cross reference table.

This commit implements such an algorithm and allows HexaPDF to read nearly all PDF files. There are two caveats for this:

1. The damaged file must be written in a way that allows such a recovery.
2. The reconstructed file may, in rare cases, not represent the original file correctly because some information is interpreted wrongly.

Thomas Leitner

commit sha 1475a9d6c7f0514c132acc855e1f43d3639abf11

Rearrange tests in parser test suite

Thomas Leitner

commit sha b499f25a300378cb47c2dc6000847aeb0d352f5f

Fix reporting of cross-reference section entry parsing error

If a cross-reference section entry is not parsable, the error is currently reported as recoverable. Fix this by making it non-recoverable.

Thomas Leitner

commit sha 387b125c87fb58ec764cfa2c4202e125a60c5818

Use PDF version 1.0 for dictionary fields defining no version requirement

Thomas Leitner

commit sha d6cd38cedb06259766a1f8ca19682bc3b813fd23

Change TrueType font validation to ignore missing fields for standard fonts

The standard 14 PDF fonts need to be Type1 fonts. However, there are PDF files out in the wild which specify the standard 14 PDF fonts as TrueType fonts and leave out the required fields. Since it seems that PDF viewers just ignore the missing fields, we do the same.

Thomas Leitner

commit sha 8ed3d4622b8a9500755953cd4ba67a9b6823aba0

Remove unnecessary checking whether value is duplicatable

Instances of some classes were not duplicatable until Ruby 2.4. This includes numbers, true, false, nil and symbols. Since these restrictions are not valid for Ruby 2.4 or newer, we can remove the code that checks for these classes before duplicating an object.

Thomas Leitner

commit sha 7ff82b7b33db5b111fe8e6d19666ac063536ab59

Don't rely on validation for a correct Type::Resources object

When such an object is created through a convenience method, make sure that it is already valid. This ensures correct behaviour even if validation is never done.

Thomas Leitner

commit sha 1a80fbb32a8143d62e792850e555a45933189797

Make xref parse error messages more concise

Thomas Leitner

commit sha 615bd8f9f45c7cfd0c1ec4b5b5be2128c1e850da

Update document/object validation code

The validation methods Object#validate and Document#validate do nearly the same, yet yield different arguments to the provided block. Furthermore, only the first validation problem can be reported.

This changes the internals so that regardless of which method is invoked, the same kind of arguments are always yielded. Furthermore, the internals of the Object#validate and #perform_validation methods have been changed so that multiple validation problems may be yielded.

Since the interfaces and internal workings have been changed, this is also a breaking change.

Thomas Leitner

commit sha 3076969af083b6deff74d5a7d4c596bf7c4c70ff

Update rubocop to 1.0 and fix issues

Thomas Leitner

commit sha 0349810c322b7a4f160001727f016e297aafa0ef

Allow lowercase page size names with image2pdf command

Thomas Leitner

commit sha 2a41e36f5f3228bb1988b93838d3d06584277096

Fix URL in API documentation

Thomas Leitner

commit sha 19d0c8eabe111a1b2b55e520b6fa978cd9e950f3

Fix error when parsing an invalid object number in inspect command

Thomas Leitner

commit sha e17edefd58455f592813ef6f7764c62ec55b8d13

Provide better error messages for hexapdf inspect commands

Thomas Leitner

commit sha d1da223e5e240d82fccc651cfeef1158c81bc788

Fix typo in manual page

Thomas Leitner

commit sha cf9b8117f27037a490bf8e325099ae5e864517fb

Add hexapdf inspect commands for showing page object/content streams by page number

Thomas Leitner

commit sha 8de76e800ea53c648307e6b402b8dbf46d843031

Output error messages in hexapdf inspect command to stderr

Thomas Leitner

commit sha 37fd215dc789bffee904145abd7c2465a5f17272

Update API documentation of Document with info on dispatched messages

Thomas Leitner

commit sha 46af1d4a0865a234f1b88f37b0224c08c260812a

Fix warning on Ruby 2.7

Thomas Leitner

commit sha 95bb6b4b841af16441f6b547a983d5b9ebb3008a

Add --check flag to CLI command info

The --check flag allows checking a file for parse and validation errors.

push time in 11 days

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

@matpowel Could you try the following example scripts with your PDF file and the latest commit from the devel branch:

Prints the number of revisions and does a simple open/write:

require 'hexapdf'
HexaPDF::Document.open(ARGV[0]) do |doc|
  puts doc.revisions.size
  doc.write('test1.pdf', validate: true)
end

Simple open/write but without validations:

require 'hexapdf'
HexaPDF::Document.open(ARGV[0]) do |doc|
  doc.write('test2.pdf', validate: false)
end

Validating the document but only the current revision:

require 'hexapdf'
HexaPDF::Document.open(ARGV[0]) do |doc|
  doc.each(only_current: true, only_loaded: false) do |obj|
    obj.validate(auto_correct: true) do |obj, msg, correctable|
      next if correctable
      raise HexaPDF::Error, "Validation error for (#{obj.oid},#{obj.gen}): #{msg}"
    end
  end
end

The problem might be that the PDF file contains multiple revisions and a prior revision is either invalid or the validation routine doesn't handle PDF with multiple revisions correctly.

efatsi

comment created time in 15 days

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

@matpowel If opening in Mac Preview works, I should be able to find the problem and make it work in HexaPDF, given you can provide the PDF file.

efatsi

comment created time in a month

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

@gettalong the error comes when trying a simple open and write without any modification. If we open the PDF in Mac Preview and re-export to PDF then it works fine. Previously we added code to try opening the doc and rescue the Malformed error when first importing the doc, now we added code to try writing to a tempfile and rescuing the Error. In both cases we send it off to ConvertAPI.com for reprocessing (we also use that to convert from Word etc) and then the PDF processes just fine.
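
The workaround described in this comment, trying the normal path first and sending the file off for re-conversion only when an error is raised, can be sketched generically. Everything below is a stand-in: `process_with_fallback`, `SketchMalformedError` and the two lambdas are hypothetical, and no real conversion service or HexaPDF call is involved. A real caller would presumably pass the HexaPDF error classes mentioned in this thread as `recoverable`.

```ruby
# Hypothetical sketch: run the normal processing step and, if it raises one
# of the listed error classes, re-convert the file and try once more.
def process_with_fallback(path, process:, reconvert:, recoverable: [StandardError])
  process.call(path)
rescue *recoverable
  process.call(reconvert.call(path))
end

# Stand-ins for illustration only; a real caller would open and write the
# document in `process` and call an external conversion service in `reconvert`.
class SketchMalformedError < StandardError; end

PROCESS = lambda do |path|
  raise SketchMalformedError, "cannot parse #{path}" if path.start_with?('damaged')
  "processed #{path}"
end
RECONVERT = ->(path) { "reconverted-#{path}" }

OUTCOME = process_with_fallback('damaged.pdf', process: PROCESS, reconvert: RECONVERT,
                                recoverable: [SketchMalformedError])
```

The second attempt is deliberately not rescued, so a file that still fails after re-conversion raises to the caller instead of looping.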

efatsi

comment created time in a month

issue comment gettalong/hexapdf

Strange behavior

@matpowel No problem - thanks for the feedback! I will add "move page method" to my TODO list ;)

As for sensitive PDFs: If it would help, I could sign a confidentiality agreement concerning the PDFs.

matpowel

comment created time in a month

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

@matpowel No, I don't think so. This issue is about parse errors due to corrupted documents. The PDF you have seems to be parsable because you can actually do something with it. However, there seems to be a structural problem with it: the Parent key is used, e.g., by the PDF-internal page tree structure. There are already mechanisms in place to repair invalid page trees during validation, so this may be something that is currently not caught. Can you run hexapdf inspect YOUR.PDF p on that file without error? Can you open and view that file in a PDF reader application?

@pashamur Thanks nonetheless!

efatsi

comment created time in a month

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

@gettalong Unfortunately I'm not allowed to share the PDFs I was testing on :( (I ended up using PDFParser/poppler utils for my use cases)

efatsi

comment created time in a month

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

@gettalong is this at all related to another error we get regularly like this? It seems to also be a corrupt PDF.

Validation error for (33,0): Required field Parent is not set

Unfortunately, I've only seen it happen on sensitive financial information so I can't send you the PDF :(

For now, for both of these errors we have to catch the exceptions and send the document off for re-conversion back to PDF, but it would be awesome if HexaPDF could handle such things if at all possible.

efatsi

comment created time in a month

issue comment gettalong/hexapdf

Strange behavior

@gettalong sorry I didn't reply earlier, we had to focus on a couple of other major areas of the code.

What's funny about this is that we ended up using the lack of document deletion in our page move method. We've now modified it per below, but I still think the new idiom is much nicer. Other parts of our code are cleaner and more intuitive when a delete is what is wanted.

image

Unfortunately it's very hard for me to ever send you the test documents because we deal with customer financial documents and they are almost always sensitive. I'll hopefully get one that I can send at some point.

matpowel

comment created time in a month

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

@pashamur Could you provide this PDF so that I can investigate the problem?

efatsi

comment created time in 2 months

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

I tried out the devel branch against one such corrupted PDF and got: Problem encountered: undefined method `oid' for nil:NilClass

efatsi

comment created time in 2 months

issue comment gettalong/hexapdf

Content processor show_text NoMethodError for NilClass

Can you share the PDF that produces the error? If it is confidential, you can send it to support@gettalong.at.

If you can't share it: Can you produce the actual line of the error? I.e. the line inside the HexaPDF library that throws this error?

joshco

comment created time in 2 months

push event cgarth/nfdi4plants-site

Christoph Garth

commit sha e4dde76d6f60c29302e4954f00e9c6289a948be6

add erroneously missing bootstrap vendor component

Christoph Garth

commit sha 99e84a0d6a72910ea15f360d4caa5f4858429a0c

add github-pages gem (required downgrade jekyll to 3.9.0)

push time in 2 months

issue opened gettalong/hexapdf

Content processor show_text NoMethodError for NilClass

(This is the same context as the colorspace issue we dealt with, in my redactor.) It is failing on the line noted below. The log lines saying 'show_text_string' have the value of str. The failure occurs when the string appears to be a single dash '-'.

rails_1_35fba6963d05 | [DEBUG] [fd6955] REDACT: show_text string  1RWH__WKLV_LV_IOLSSHG_IURP_RXU_GLVFXVVLRQ_LQ_WRGD\
rails_1_35fba6963d05 | [DEBUG] [fd6955]  V_PHHWLQJ___,W_QHHGV_WR_EH_WKLV_ZD\_WR_PDNH_WKH_WLPHOLQH_ZRUN_DQG_VR_SHRSOH_JH
W_WR nil? false blank? false
rails_1_35fba6963d05 | [DEBUG] [fd6955] REDACT: show_text string  NQRZ_ZKDW_FRPPLWWHHV_WKH\_DUH_PHPEHUV_RI_EHIRUH_WKRVH
_FRPPLWWHHV_PHHW_ nil? false blank? false
rails_1_35fba6963d05 | [DEBUG] [fd6955] REDACT: show_text string  - nil? false blank? false
rails_1_35fba6963d05 | [INFO ] [fd6955] method=GET  path=/story/
ontroller=story/articles action=view status=500  error='NoMethodError:undefined method `[]' for nil:NilClass'  duration=
881.97 db=0.00 uid=1  full_url=http://t
rails_1_35fba6963d05 | [FATAL] [fd6955]
rails_1_35fba6963d05 | [FATAL] [fd6955] NoMethodError (undefined  method `[]' for nil:NilClass):
rails_1_35fba6963d05 | [FATAL] [fd6955]    app/services/hexrep.rb:48:in `show_text'
rails_1_35fba6963d05 | [FATAL] [fd6955]    app/services/hexrep.rb:135:in `block in redact'
rails_1_35fba6963d05 | [FATAL] [fd6955]    app/services/hexrep.rb:130:in `with_index'
rails_1_35fba6963d05 | [FATAL] [fd6955]    app/services/hexrep.rb:130:in `redact'

hexrep.rb


class Hexrep < HexaPDF::Content::Processor

  def initialize(page)
    super()
    @canvas = page.canvas(type: :overlay)
    @boxeslist = []
    @emails = []
    @strings = []
  end


  def show_text(str)

    begin
      strings = decode_text(str)
      @strings << strings
      boxes = decode_text_with_positioning(str) # Exception happens on this line, line 48
      boxes.each do |box|
        @boxeslist << box
      end
    rescue NoMethodError => ex
      Rails.logger.warn "PDF redact failure on string #{str}"
    end

  end

 #.........

created time in 2 months

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

@efatsi Thanks for the response and no problem!

efatsi

comment created time in 3 months

issue comment gettalong/hexapdf

Unexpected error: HexaPDF::MalformedPDFError

Thanks @gettalong, unfortunately I'm not in a position to test this out at the moment, I never had direct access to the malformed PDFs, only complaints from the software users and error logs.

efatsi

comment created time in 3 months
