If you are wondering where the data of this site comes from, please visit https://api.github.com/users/willglynn/events. GitMemory does not store any data, but only uses NGINX to cache data for a period of time. The idea behind GitMemory is simply to give users a better reading experience.

m4b/goblin 591

An impish, cross-platform binary parsing crate, written in Rust

jasoncharnes/run.rb 518

Run Ruby in the browser using WebAssembly

willglynn/attribute_normalizer 1

Adds the ability to normalize attributes cleanly with code blocks and predefined normalizers

willglynn/activerecord-sqlserver-adapter 0

SQL Server Adapter For Rails

willglynn/amazon-vpc-cni-k8s 0

Networking plugin repository for pod networking in Kubernetes using Elastic Network Interfaces on AWS

willglynn/artichoke 0

💎 Artichoke is a Ruby made with Rust

willglynn/aws-sdk-for-java 0

Official mirror of the AWS SDK for Java. For more information on the AWS SDK for Java, see our web site:

willglynn/binaryen 0

Compiler infrastructure and toolchain library for WebAssembly, in C++

willglynn/braintree_ruby 0

braintree ruby client library

issue comment m4b/goblin

Support for text based stub libraries (.tbd) files

Oh I actually think I like this idea, I believe you’re proposing e.g.:

pub struct Tbd<'a> {
    pub bytes: &'a [u8],
}

Then, as a first approximation (maybe final!), we can direct users to parse the file using e.g. your crate, if you publish it?

That's exactly what I'm proposing.

I guess the advantage to returning the TBD(TBD) variant is just to signal to the user we didn’t encounter something expected (but still leave it up to them how to parse it further...)

Yes. By introducing the enum variant, goblin effectively says we know this data is related to mach-o. That could be a useful signal to end-users and a nudge that they should consider doing something with the data. It does add an API break though. I'm unsure if you are willing to do that with the minimal added benefit considering there would be no built-in support for reading the TBD content. Although we could potentially throw something useful in there, such as sniffing the YAML document versions and exposing the TBD version via the goblin API.
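The version sniffing mentioned above could look roughly like the sketch below. It only reads the YAML tag on the first document-start line and does no real YAML parsing. The function name `sniff_tbd_tag` is hypothetical, and the tag spelling used in the doc comment (`!tapi-tbd-v3`) is an assumption taken from the format description in LLVM's TextStub.cpp, not something goblin defines.

```rust
/// Return the YAML tag on the first non-empty line, if that line is a
/// document start marker followed by a tag.
/// Assumed example: input starting with `--- !tapi-tbd-v3` yields `tapi-tbd-v3`.
pub fn sniff_tbd_tag(text: &str) -> Option<&str> {
    // Skip leading blank lines and look at the first line with content.
    let first = text.lines().find(|l| !l.trim().is_empty())?;
    // A YAML document start marker is three dashes.
    let rest = first.trim().strip_prefix("---")?;
    // The tag, if present, is introduced by '!'.
    let tag = rest.trim().strip_prefix('!')?;
    // The tag ends at the first whitespace character.
    tag.split_whitespace().next()
}
```

A caller could use a `Some(..)` result as the hint that the input is a .tbd file and hand the full text to whatever YAML parser it prefers.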

indygreg

comment created time in 9 hours

issue comment m4b/goblin

Support for text based stub libraries (.tbd) files

In order to avoid a future API break, do you think it would be worth declaring a new TBD type - initially an alias or thin wrapper around &'a str so that in the future if we wanted to add TBD parsing to goblin or hang additional methods off that type, we could do so without an API break? Or does YAGNI apply?

Oh I actually think I like this idea, I believe you’re proposing e.g.:

pub struct Tbd<'a> {
    pub bytes: &'a [u8],
}

Then, as a first approximation (maybe final!), we can direct users to parse the file using e.g. your crate, if you publish it?

On the other hand the user could just do all the same by attempting such a parse if they encounter an Unknown variant?

I guess the advantage to returning the TBD(TBD) variant is just to signal to the user we didn’t encounter something expected (but still leave it up to them how to parse it further...)

indygreg

comment created time in a day

issue closed m4b/goblin

Finding path to dll import locations

Hello,

Is there a way to use goblin to not only read a dll name but also get the paths of the dll files imported? I'm trying to implement a folder dependency walker that parses the dll files to create a hashmap of dll name and path to its location, as shown in this repository's main.rs: https://github.com/ryancinsight/depwalker

Any ideas or suggestions would be greatly appreciated.
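A binary only records the names of the DLLs it imports, so the paths have to come from a directory search done by the caller. The sketch below shows that second step with the standard library only. `resolve_imports` is a hypothetical helper, and the list of names is assumed to come from a parser such as goblin (for PE files, the imported library names it reports).

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Map each imported DLL name to the first search directory that contains it.
/// Names that cannot be found in any directory are left out of the map.
pub fn resolve_imports(names: &[&str], search_dirs: &[&Path]) -> HashMap<String, PathBuf> {
    let mut found = HashMap::new();
    for &name in names {
        for &dir in search_dirs {
            let candidate = dir.join(name);
            if candidate.is_file() {
                found.insert(name.to_string(), candidate);
                break; // first match wins, similar to a loader's search order
            }
        }
    }
    found
}
```

The lookup is an exact file-name match, so on a case-sensitive file system a name like `KERNEL32.dll` will not match `kernel32.dll`; a real dependency walker would probably want to normalize case.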

closed time in a day

ryancinsight

issue comment m4b/goblin

Finding path to dll import locations

Thanks!

ryancinsight

comment created time in a day

issue comment m4b/goblin

Support for text based stub libraries (.tbd) files

If you are proposing goblin::Object::parse() would return a goblin::Object::Mach TBD(&'a str) variant with the unparsed YAML content, that seems like a reasonable first step to me. The API docs could potentially advertise a 3rd party crate for parsing if we wanted.

In order to avoid a future API break, do you think it would be worth declaring a new TBD type - initially an alias or thin wrapper around &'a str so that in the future if we wanted to add TBD parsing to goblin or hang additional methods off that type, we could do so without an API break? Or does YAGNI apply?
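As a rough illustration of the thin-wrapper idea, here is a minimal sketch. The type name `Tbd` and its methods are hypothetical and not part of goblin. The field is private, so the representation could later change (for example to hold parsed data) and new methods could be added without breaking callers.

```rust
/// Hypothetical wrapper around the unparsed text of a .tbd file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Tbd<'a> {
    // Private on purpose: callers cannot depend on the representation.
    text: &'a str,
}

impl<'a> Tbd<'a> {
    /// Wrap the raw contents of a .tbd file without parsing them.
    pub fn new(text: &'a str) -> Self {
        Tbd { text }
    }

    /// The raw YAML text, for callers that bring their own parser.
    pub fn as_str(&self) -> &'a str {
        self.text
    }
}
```

Compared with exposing a bare `&'a str` in the enum variant, this costs one extra type today but keeps the variant's payload type stable if parsing support is added later.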

indygreg

comment created time in a day

issue comment m4b/goblin

Support for text based stub libraries (.tbd) files

So I'm wondering if a simpler, more flexible approach here is just to detect the tbd (glancing at your code, this seems somewhat complicated/tedious), and just return a TBD(&'a str) for the variant? That way downstream users can parse it, do whatever they want with it, and we can sidestep the whole external deps/serde/yaml situation in goblin?

indygreg

comment created time in a day

issue comment m4b/goblin

Support for text based stub libraries (.tbd) files

I decided to shave a yak today and I implemented a minimal crate for parsing .tbd using serde-yaml. Unfortunately, I learned when doing this that yaml-rust doesn't expose tags on YAML documents/hashmaps (https://github.com/dtolnay/serde-yaml/issues/147), so I had to do something very ugly to parse the YAML (https://github.com/indygreg/PyOxidizer/blob/ca1c80bbcce64fc58d5f9a9806055bd32ff7a567/text-stub-library/src/lib.rs#L71). It works. But I feel dirty.

This would seemingly be more ammunition for not using an uncontrolled 3rd party crate for the .tbd parsing. The YAML syntax in use in .tbd files seems simple enough that a simple parser could likely be cooked up. (Although TBH I'm unsure if I have the stomach to implement a custom parser for goblin. Serde yes, custom parser I don't know.)

FWIW I validated my .tbd parser by parsing every .tbd file in every SDK in every Xcode version installed on the GitHub Actions workers and it worked. So I have some degree of confidence that the schema is correct, at least for version 3 and 4 of the format. (Version 4 seems to have been introduced for the macOS 11 SDK AFAICT in order to support multi-arch libraries.) Feel free to steal the Rust structs if you want. The code is MPL 2.0. But I give you permission to relicense the code in that crate under whatever license you want. All the attribution I need is maybe a reference in the commit message.

indygreg

comment created time in a day

issue comment m4b/goblin

Support for text based stub libraries (.tbd) files

@indygreg first off, thank you for the kind words! :)

So re .tbd files, the short is that I would/could be persuaded to include them in goblin. The long is as follows, with some pros and cons, and rationale.

So generally speaking, in the past, I've opted for:

  1. very few dependencies
  2. including as submodules in goblin things which are directly related to "binaries" or artifacts in the linking phase (including, ideally, both reading and writing of those); today, we have archive, elf, mach, and pe (and strtab as a helper in all three).

These were the general motivating "directives" of the library. So e.g., when there was some push for wasm support, i was 50/50 on including/not including, creating own submodule, or pulling in external, etc. We may still do that!

But on the subject of .tbd's, while they are a textual format, they're actually treated effectively as a drop-in replacement for dynamic libraries on darwin based systems by the linker, so there is definitely a very strong case for just adding another variant in the enum based parser.

On that note, i think it could be reasonable to add the variant, e.g. here:

#[derive(Debug)]
#[allow(clippy::large_enum_variant)]
/// Either a collection of multiple architectures, or a single mach-o binary
pub enum Mach<'a> {
    /// A "fat" multi-architecture binary container
    Fat(MultiArch<'a>),
    /// A regular Mach-o binary
    Binary(MachO<'a>),
    /// A Tbd file
    TBD(TBD<'a>), // or maybe a `&'a str` ?
}

As noted by @willglynn a way to do this might be to have an external tbd crate, which we conditionally include with a new feature ("tbd", perhaps).

There are two major problems with this approach:

  1. I generally believe that types should be invariant across feature cfg's (and i have violated this elsewhere, notably in scroll's error enum for no_std)
  2. I like to be in control of external deps (I've seen dependency spray explode in other projects, where every new update seems to add some new random crate for doing something) (also don't get me wrong, I love cargo crates system and deps!)

so the above two are the major problems I see in adding support for .tbd using an external crate, 1. types whose size/variants change across cfgs, 2. non-control of deps.

In particular, parsing/loading the .tbd might involve:

  1. serde
  2. yaml crate

which will likely massively explode the amount of deps. Maybe this can be "fixed" with a cfg on that variant, etc., but this is in violation of 1., and it just doesn't seem very clean to me.

Also, having had experience with cfgs and rust features, used like this I do believe it's an anti-pattern, and can cause unintentional bloat and/or unusual recompiles when features don't unify in a workspace with multiple crates.

Anyway, these are some of my fears, and i'm heavily biased towards not including parsing functionality in goblin itself for tbd; however, as I said, I could be persuaded.

Some random thoughts/questions:

  1. .tbd files are fairly simple; could we write a custom "tbd" parser using scroll
  2. for starters, if we did 1., I think the user will only really want/care about the architectures + symbol list the .tbd file provides, which should be fairly straightforward?
  3. On that note, what particular data/information should we extract from the .tbd in goblin; or put another way, if we're just doing let tbd = yaml::deserialize(tbd_file)? why does/would this need to be in goblin? maybe a good compromise might be to add a TBD hint then instead, and let user deserialize using yaml/whatever crate they want? And if we aren't just returning the yaml-in-struct form, maybe this is a semi-compelling argument to just do newline parsing + roll-your-own-tbd parser inside goblin to get out e.g, the symbols + architecture?
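To make the "newline parsing + roll-your-own" option concrete, here is a deliberately naive sketch that pulls the items of a named key (such as `archs` or `symbols`) out of the text. It is not a YAML parser: it assumes every list is a flow sequence written on a single line, which real .tbd files do not guarantee, and the function names are hypothetical. It mainly shows how little code the happy path needs and why the approach could be brittle.

```rust
/// Parse a single-line YAML flow sequence such as `[ _foo, _bar ]`.
/// Returns `None` if the value is not bracketed on one line.
fn flow_items(value: &str) -> Option<Vec<&str>> {
    let inner = value.trim().strip_prefix('[')?.strip_suffix(']')?;
    Some(
        inner
            .split(',')
            .map(|s| s.trim().trim_matches('\''))
            .filter(|s| !s.is_empty())
            .collect(),
    )
}

/// Collect every item listed under `key`, anywhere in the document.
/// Items are returned in file order, and duplicates are kept.
pub fn collect_list<'a>(text: &'a str, key: &str) -> Vec<&'a str> {
    let mut out = Vec::new();
    for line in text.lines() {
        // Drop indentation and a leading list marker ("- ").
        let line = line.trim().trim_start_matches("- ").trim_start();
        if let Some((k, v)) = line.split_once(':') {
            if k.trim() == key {
                if let Some(items) = flow_items(v) {
                    out.extend(items);
                }
            }
        }
    }
    out
}
```

Sequences that wrap across lines, block-style lists, and per-architecture export sections are all ignored or flattened here, which is the kind of gap a hand-written parser in goblin would have to close.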

So to summarize:

  1. I agree it would be nice to include/detect/parse? .tbd files (the granularity is what we do exactly :) )
  2. I would prefer not to have conditional cfg's + an external tbd crate that i'm not in control of
  3. if all we're doing is deserializing the yaml, so we can return e.g. the symbols and arch + some other info (I don't think there's much else besides the version and the lib that provides it), maybe it's better to do the parsing ourselves and/or just return a &'a str to let the users do the parsing themselves with whatever they want, or a yaml structured deserializer via serde, etc. The downside is this could be brittle? I'm not sure

Thoughts appreciated :D

indygreg

comment created time in a day

push event m4b/goblin

unknown

commit sha 70a79bf5dee271c2ffae2fc98331955470e406d5

tests.elf: Replace unchecked unsafe slicing with regular slicing

push time in a day

PR merged m4b/goblin

Replace unchecked unsafe slicing with regular slicing in elf test

This test performs the equivalent of subslicing a slice through unsafe code and avoids the bounds checking:

    let hashtab: &[u8] = unsafe {
        let addr = base.as_ptr().add(hash_section.sh_offset as usize);
        let size = hash_section.sh_size as usize;
        slice::from_raw_parts(addr, size)
    };

Since it's a test, I expect the bounds checking to not matter (I also expect that any potential overhead is dwarfed by everything else this function is doing like the vec allocation or the Elf parsing)
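For comparison, a bounds-checked version of the same subslicing can be written without `unsafe`. The sketch below is not the code from the PR (the replacement itself is not shown here). It uses `slice::get`, which returns `None` on an out-of-range request, whereas plain indexing such as `&base[offset..offset + size]` would panic. The helper name `subslice` is hypothetical.

```rust
/// Safe counterpart of `slice::from_raw_parts(base.as_ptr().add(offset), size)`.
/// Returns `None` instead of reading out of bounds when the range does not fit.
pub fn subslice(base: &[u8], offset: usize, size: usize) -> Option<&[u8]> {
    // Guard against `offset + size` overflowing before building the range.
    let end = offset.checked_add(size)?;
    base.get(offset..end)
}
```

In a test, `subslice(..).unwrap()` or plain indexing both turn a bad section header into a clear failure rather than undefined behavior.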

+5 -7

0 comment

1 changed file

nico-abram

pr closed time in a day

issue comment raceintospace/raceintospace

Line drawn in wrong spot in Step Failure

I wonder if a better idea might be to add: [astronaut name] FAILED ROLL

or something like that, in the gray area beneath "1 VS 91".

peyre

comment created time in a day

issue opened m4b/goblin

Support for text based stub libraries (.tbd) files

Thank you for maintaining goblin. It is a joy being able to open binary files from any platform and analyze their contents without having to install a myriad of tools to support various binary formats.

I recently found myself wanting to parse text based stub libraries (.tbd files) from Rust and was curious if you would be receptive to including support in goblin. (I might contribute support myself.)

.tbd files are essentially descriptors of mach-o dylibs. Apple uses them in their SDKs to describe dylibs. I think the motive behind these files is it enables linkers to do their job using a minimal representation of the dylib without having to ship full dylibs in SDKs. This helps reduce the size of the SDK.

I'm unsure if there is a canonical specification for this file format. However, there's a comprehensive inline comment in the LLVM source code at https://github.com/llvm/llvm-project/blob/main/llvm/lib/TextAPI/MachO/TextStub.cpp that defines it.

The file format is YAML. If I were to implement support for parsing these files in Rust, I'd likely define a bunch of Rust structs representing the various components and then use serde for (de)serialization from/to YAML. If we did this in goblin, we'd pick up a handful of new crate dependencies. I'm unsure if that would be desirable. Of course, we could always define a conditional crate feature to toggle support for text based stub libraries.

Given that text based stub libraries describe mach-o libraries and are used widely on Apple platforms, I can make a compelling case for their inclusion in goblin as a supported format. I was unable to find any Rust crates for parsing this file format on crates.io, so there appears to be a market need.

Are you interested in supporting text based stub libraries in goblin? If so, do you have any thoughts on YAML parsing and new crate dependencies?

created time in a day

started pi0neerpat/unlock-protocol-bot

started time in 3 days

issue comment m4b/goblin

Finding path to dll import locations

Maybe you can take inspiration from crosstool-ng's ldd script: https://gist.github.com/jerome-pouiller/c403786c1394f53f44a3b61214489e6f

ryancinsight

comment created time in 7 days

created tag willglynn/pdb

tag 0.7.0

A parser for Microsoft PDB (Program Database) debugging information

created time in 7 days

push event willglynn/pdb

Jan Michael Auer

commit sha 5f07022b0188a4c9c39ff9d23270b1631c223631

Release 0.7.0

push time in 7 days

push event willglynn/pdb

Arpad Borsos

commit sha aa858ed385a490c6762684c16a078f1fc480b754

fix: Allow parsing modules that have no symbols

The expected CV Signature of a ModuleInfo seems to be a part of the symbols, meaning that we shouldn’t read it when the symbols_size is 0.

Arpad Borsos

commit sha d4bf55959d23a70c196091569fbecaf2e047d8b9

wording

Jan Michael Auer

commit sha af977cf2d68ef0eeb4d0788ebcfd74ce68836b8b

Merge pull request #102 from Swatinem/fix/module-zero-symbols

fix: Allow parsing modules that have no symbols

push time in 7 days

PR merged willglynn/pdb

fix: Allow parsing modules that have no symbols

The expected CV Signature of a ModuleInfo seems to be a part of the symbols, meaning that we shouldn’t read it when the symbols_size is 0.
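The idea behind the fix can be sketched in isolation as follows. This is not the pdb crate's code: the function name, the `String` error type, and the little-endian read are simplifications, and the value 4 for `CV_SIGNATURE_C13` is an assumption based on the CodeView headers. The point is only that an empty symbol buffer is accepted without trying to read a signature from it.

```rust
/// Assumed value of the C13 CodeView signature (hypothetical stand-in for
/// the constant the pdb crate defines).
const CV_SIGNATURE_C13: u32 = 4;

/// Validate the leading signature of a module's symbol bytes and return the
/// bytes that follow it. An empty buffer means the module has no symbols,
/// so there is no signature to read.
pub fn check_symbols_signature(symbols: &[u8]) -> Result<&[u8], String> {
    if symbols.is_empty() {
        return Ok(symbols);
    }
    if symbols.len() < 4 {
        return Err("symbol data too short for a signature".to_string());
    }
    let sig = u32::from_le_bytes([symbols[0], symbols[1], symbols[2], symbols[3]]);
    if sig != CV_SIGNATURE_C13 {
        return Err(format!("unsupported symbol data signature {:#x}", sig));
    }
    Ok(&symbols[4..])
}
```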

+14 -15

0 comment

2 changed files

Swatinem

pr closed time in 7 days

Pull request review comment willglynn/pdb

fix: Allow parsing modules that have no symbols

 impl<'s> ModuleInfo<'s> {
     pub fn symbols(&self) -> Result<SymbolIter<'_>> {
         let mut buf = self.stream.parse_buffer();
         buf.truncate(self.symbols_size)?;
-        buf.parse_u32()?;
+        if self.symbols_size > 0 {
+            let sig = buf.parse_u32()?;
+            if sig != constants::CV_SIGNATURE_C13 {
+                return Err(Error::UnimplementedFeature(
+                    "Unsupported module info format",

This could probably be reworded to indicate the symbol data.

Swatinem

comment created time in 7 days

PR opened willglynn/pdb

fix: Allow parsing modules that have no symbols

The expected CV Signature of a ModuleInfo seems to be a part of the symbols, meaning that we shouldn’t read it when the symbols_size is 0.

+13 -2

0 comment

1 changed file

pr created time in 7 days

fork Swatinem/pdb

A parser for Microsoft PDB (Program Database) debugging information

https://docs.rs/pdb/

fork in 7 days

push event raceintospace/raceintospace

Ignaz Forster

commit sha 2118fd0e8978e713437fbe764c94dae61a8d5d2a

Reintroduce fullscreen support

The option was just ignored - parse it again to set SDL_FULLSCREEN.

Previously the SDL_Quit() call was missing. This function should always be called during shutdown independent of this commit, but in this case it also resulted in the resolution not being reset on exit.

Also see issue #359.

Leon Baradat

commit sha cdfe999390153fc886a93d6f9c9ecf9fc0ee118c

Merge pull request #517 from laenion/master

Reintroduce fullscreen support

push time in 7 days

PR merged raceintospace/raceintospace

Reintroduce fullscreen support

The option was just ignored - parse it again to set SDL_FULLSCREEN.

Previously the SDL_Quit() call was missing. This function should always be called during shutdown independent of this commit, but in this case it also resulted in the resolution not being reset on exit.

Also see issue #359.

+7 -2

0 comment

1 changed file

laenion

pr closed time in 7 days

issue opened m4b/goblin

Finding path to dll import locations

Hello,

Is there a way to use goblin to not only read a dll name but also get the paths of the dll files imported? I'm trying to implement a folder dependency walker that parses the dll files to create a hashmap of dll name and path to its location, as shown in this repository's main.rs: https://github.com/ryancinsight/depwalker

Any ideas or suggestions would be greatly appreciated.

created time in 7 days

issue opened m4b/goblin

elf: Implement parsing of symbol versioning

See https://github.com/PyO3/maturin/pull/436/files#r581158722

created time in 8 days

push event raceintospace/raceintospace

Leon Baradat

commit sha 09ccadfd72cb650dd79619cd0db65304c49188fe

Made the Saturn V even a little bit less fat

push time in 8 days

fork jech/turn

Pion TURN, an API for building TURN clients and servers

fork in 9 days

issue comment raceintospace/raceintospace

B-Kicker doesn't look right

Maybe a slight change to the palette would fix this - don't know.

peyre

comment created time in 9 days

push event raceintospace/raceintospace

Leon Baradat

commit sha 394d7a9693245c1ea4269db31c1524fe5133d69c

Fixed (Orbit) warning never recurring

push time in 9 days

issue comment raceintospace/raceintospace

DM in orbit alert

Oh shoot, if you say Yes to that message, it will never come up again in a game. That's fine for while you're in Future Missions, but once you hit Continue there, it should reset so next time you schedule a mission you'll be hit with the warning if you select an (Orbit) mission.

rnyoakum

comment created time in 9 days