Konrad Höffner KonradHoeffner @AKSW & @IMISE Leipzig, Germany http://aksw.org/KonradHoeffner Research Assistant @IMISE.

GeoKnow/LinkedGeoData 108

OpenStreetMap for the Semantic Web

AskNowQA/cubeqa 16

CubeQA—Question Answering on Statistical Linked Data

gerbsen/dbpedia-lucene-index 4

index for all resources with labels, uris, surface forms, images etc.

hitontology/ontology 2

The Health IT Ontology.

IMISE/imise-classicthesis 2

Optimal LaTeX template for Bachelor and Master Theses

KonradHoeffner/dotfiles 1

my linux configuration files and scripts

GERZAC1002/BeLL 0

Contains the files of my special learning achievement

hitontology/database-frontend 0

HITO Flask-SQLAlchemy Database Frontend

push event hitontology/ontology

Konrad Höffner

commit sha c4d7bf0072d84da813dd7bf38f3694c087c81978

Add LIMES all catalogue mapping file early draft. Part of #94.

push time in 4 days

issue comment hitontology/ontology

Link catalogues

See https://github.com/dice-group/LIMES/issues/255. For now I use the option of discarding all matches with a score of exactly 1.0.

KonradHoeffner

comment created time in 4 days

issue opened dice-group/LIMES

How to map a single dataset to itself?

Is it possible to use LIMES with more than two sources that are all contained in the same file? The sources should be mapped to each other, but of course I don't want to map a source to itself. To clarify with an example, let's say I have a class :Country with many instances, and each country has a population of individuals. All of this data is in the same file, countries.ttl. Now I want to find out which individuals live in more than one country.

:Germany a :Country;
 rdfs:label "Germany".

:Azerbaijan a :Country;
 rdfs:label "Azerbaijan".

:person123 a :Person;
 rdfs:label "Alex Müller";
 :country :Germany.

:person456 a :Person;
 rdfs:label "Alex Mueller";
 :country :Azerbaijan.

This can be done in the following manner, declaring source and target alike:

    <SOURCE>
        <ID>c1</ID>
        <ENDPOINT>countries.ttl</ENDPOINT>
        <VAR>?c1</VAR>
        <PAGESIZE>-1</PAGESIZE>
        <RESTRICTION>?c1 a :Person; :country ?x.</RESTRICTION>
        <PROPERTY>rdfs:label AS nolang->lowercase->regularalphabet RENAME label</PROPERTY>
        <TYPE>TURTLE</TYPE>
    </SOURCE>

    <TARGET>
        <ID>c2</ID>
        <ENDPOINT>countries.ttl</ENDPOINT>
        <VAR>?c2</VAR>
        <PAGESIZE>-1</PAGESIZE>
        <RESTRICTION>?c2 a :Person; :country ?y.</RESTRICTION>
        <PROPERTY>rdfs:label AS nolang->lowercase->regularalphabet RENAME label</PROPERTY>
        <TYPE>TURTLE</TYPE>
    </TARGET>

   <METRIC>trigrams(c1.label,c2.label)</METRIC>

However, this generates a false match of every person to itself, and it also matches each pair twice, once in each direction. I would like to add a restriction like "STR(?x) < STR(?y)", but it seems that one cannot reference variables from the source in the restriction of the target. A workaround is to throw away all matches with a score of exactly 1.0, but this wastes resources and also discards correct matches that happen to be exactly equal.

    <ACCEPTANCE>
        <THRESHOLD>1</THRESHOLD>
        <FILE>exact.ttl</FILE>
        <RELATION>owl:sameAs</RELATION>
    </ACCEPTANCE>
    
    <REVIEW>
        <THRESHOLD>0.8</THRESHOLD>
        <FILE>close.ttl</FILE>
        <RELATION>owl:sameAs</RELATION>
    </REVIEW>

Another way is to perform postprocessing to remove all duplicate and self matches but that seems to be inefficient in both developer and execution time.
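A minimal sketch of such a postprocessing step in Python, assuming the matches have already been loaded as (source, target, score) tuples. The function name `dedupe_matches` and the sample score 0.85 are hypothetical, and parsing the LIMES output files is not shown:

```python
def dedupe_matches(matches):
    """Drop self matches and keep one entry per unordered pair."""
    best = {}
    for source, target, score in matches:
        if source == target:
            continue  # a resource matched to itself carries no information
        # Order the pair so that (a, b) and (b, a) share the same key.
        key = (source, target) if source < target else (target, source)
        if key not in best or score > best[key]:
            best[key] = score
    return [(a, b, s) for (a, b), s in sorted(best.items())]


# Hypothetical sample input: both self matches and both directions of one pair.
matches = [
    (":person123", ":person123", 1.0),
    (":person123", ":person456", 0.85),
    (":person456", ":person123", 0.85),
    (":person456", ":person456", 1.0),
]
print(dedupe_matches(matches))
```

This keeps genuine matches with score 1.0 between different resources, unlike the threshold workaround, but the matching itself still computes all the redundant pairs.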

Lastly, I could write a script that enumerates all n*(n-1)/2 unique non-self-matching pairs and generates as many LIMES configuration files, but that has its own problems.
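The enumeration part of such a script could be sketched in Python as follows. The helper name `unique_pairs` and the third country `:France` are hypothetical, and generating the actual configuration files is left out:

```python
from itertools import combinations


def unique_pairs(items):
    """Return every unordered pair of distinct items exactly once."""
    return list(combinations(items, 2))


# Hypothetical list of countries; :France is not part of the example data.
countries = [":Germany", ":Azerbaijan", ":France"]
pairs = unique_pairs(countries)

n = len(countries)
assert len(pairs) == n * (n - 1) // 2  # 3 pairs for 3 countries

for source, target in pairs:
    # One configuration file per pair would be generated here.
    print(source, target)
```

The number of pairs, and therefore of configuration files and LIMES runs, grows quadratically with the number of countries.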

Is there any way to solve this task efficiently using LIMES or do I need to use one of the mentioned imperfect options?

created time in 4 days

PR opened dice-group/LIMES

Speed up tests

Increase surefire plugin fork count from 1 to 4 to speed up unit tests.

+1 -1

0 comment

1 changed file

pr created time in 4 days

push event KonradHoeffner/LIMES

Konrad Höffner

commit sha 3e3fd17f4b80846206ddf75141648a9be962100f

Speed up tests Increase surefire plugin fork count from 1 to 4 to speed up unit tests.

push time in 4 days

fork KonradHoeffner/LIMES

Link Discovery Framework for Metric Spaces.

https://limes.demos.dice-research.org/

fork in 4 days

issue opened snikproject/docker

create docker-compose.yml

created time in 4 days

push event snikproject/ontology

Konrad Höffner

commit sha a9b61c3c76fe06850ed1ea68779caef9e5b6fa3d

Fix meta.ttl.

Konrad Höffner

commit sha 9fb2cee31be36681da5b99f9d6dcb990af36d7ef

Update setup notes in readme.

push time in 4 days

issue comment openlink/virtuoso-opensource

How to best execute arbitrary SQL on first startup?

@pkleef: Is it possible to share the current state of the draft? I checked the community forum but didn't find it yet.

KonradHoeffner

comment created time in 4 days

issue comment openlink/virtuoso-opensource

Docker image with virtuoso and loaded data

@HughWilliams: Is there a rough estimate of when this will be implemented? I looked into the community forum but didn't find it on first glance.

ccolonna

comment created time in 4 days

issue opened openlink/virtuoso-opensource

Error in documentation of DB.DBA.RDF_GRAPH_GROUP_CREATE

The is_silent parameter raises an error when set to 0, as expected given the parameter name. However, the web page http://docs.openlinksw.com/virtuoso/fn_rdf_graph_group_create/, as cited below, describes it the other way around:

<div class="refnamediv"><p>DB.DBA.RDF_GRAPH_GROUP_CREATE — Creates graph group.</p>

<h3>is_silent</h3> <p>1 or 0. When set to 1, and there is already group with the given name, then raises the error. When is set to 0 then will not show error message.</p>

created time in 4 days

push event hitontology/lodview

Konrad Höffner

commit sha d44282d86d37c859f5ed3cd2e3984cf946dd0125

Use serial garbage collector to reduce memory consumption.

push time in 4 days

push event snikproject/snik-lodview

Konrad Höffner

commit sha c85d6ee1b1502e1047656772f6a637274d048e9c

Use serial garbage collector to reduce memory consumption.

push time in 4 days

PR opened LodLive/LodView

Use serial garbage collector to reduce memory consumption.

The memory consumption of the default and serial garbage collectors is compared in https://github.com/LodLive/LodView/issues/54.

+1 -0

0 comment

1 changed file

pr created time in 4 days

create branch KonradHoeffner/LodView

branch : optimize-docker-memory

created branch time in 4 days

issue comment LodLive/LodView

docker memory consumption

Testing with visualvm

$ docker run --network=host --privileged -d -e CATALINA_OPTS="$CATALINA_OPTS -Dcom.sun.management.jmxremote.rmi.port=9090 -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=9090 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.local.only=false -Djava.rmi.server.hostname=localhost" lodview
KonradHoeffner

comment created time in 4 days

issue opened LodLive/LodView

docker memory consumption

Using the current master branch, the docker container uses more than 500 MB of RAM even at the start when nothing is happening:

$ docker build -t lodview .
$ docker run --network=host -d lodview
$ docker stats
CONTAINER ID   NAME            CPU %     MEM USAGE / LIMIT     MEM %     NET I/O   BLOCK I/O     PIDS
413bd2f64fca   sweet_leavitt   0.00%     529.5MiB / 15.42GiB   3.35%     0B / 0B   57.6MB / 0B   36

Is there a way to reduce the usage?

created time in 4 days

push event snikproject/ontology

Konrad Höffner

commit sha f699c55b922a382a695e6e7a396ef90144dcf93d

Fix combine script.

push time in 5 days

push event snikproject/ontology

Konrad Höffner

commit sha cb05f03c22287c7c7dacf453d801392985b69558

Add gource repository video scripts.

push time in 5 days

push event hitontology/ontology

Konrad Höffner

commit sha 7cbf24e56c7265dd37f9a48f72b533927d66ffc7

Add gource repository video scripts.

push time in 5 days

issue opened acaudwell/Gource

Font size only applies to date and time and does not scale with resolution

There are two problems with the font-size that both make the text too small to read on high resolutions such as 3840x2160 (4k):

  1. The font size does not scale with resolution. So if I create a 4k video and upload it on YouTube and the users watch it in 720p, they cannot read anything.
  2. The font size parameter --font-size only applies to the date and time at the top, so there is no way to manually increase the font size of the text in edges of the graph and the user names.

Minimum Working Example

Use any repository and compare:

  • gource -320x200 --font-size 1
  • gource -3840x2160 --font-size 1
  • gource -320x200 --font-size 100
  • gource -3840x2160 --font-size 100

created time in 5 days

push event hitontology/ontology

Konrad Höffner

commit sha a6f7b6dfb13ff13889e77b71e37a6b167c696bdd

Add gource repository video scripts.

push time in 5 days

push event hitontology/ontology

Konrad Höffner

commit sha 3fa3120dedd777ec51df82188bc6622d9c566714

Add gource repository video scripts.

push time in 5 days

push event hitontology/ontology

Konrad Höffner

commit sha d8cf080d36b02cdd204e4182811a8159f664a4f6

Add gource repository video scripts.

push time in 5 days

push event hitontology/ontology

Konrad Höffner

commit sha 3cc420ec3ce4f48dbae3a094b5138e592aecdbf4

Add gource repository video scripts.

push time in 5 days

push event snikproject/ontology

Konrad Höffner

commit sha 13e1f76e466fc2e77c28dfdb6d9ac6b72accc849

Add gource repository commit video scripts.

push time in 5 days

issue comment snikproject/snik-graph

Search auto complete and suggestions

Experimented with Algolia Autocomplete, see https://www.algolia.com/doc/ui-libraries/autocomplete/introduction/getting-started/. The branch is https://github.com/snikproject/snik-graph/tree/feature-autocomplete. However, testing shows that the current Fuse.js search is not responsive enough for a good user experience with autocomplete. It can take several seconds to initialize the search index on the first search and then several seconds again on each subsequent search, even on an Intel i7-9700. While that is not the newest or best CPU, many users probably have worse ones.

KonradHoeffner

comment created time in 5 days

push event snikproject/snik-graph

Konrad Höffner

commit sha 2e4059b3d9f012b14e27531669e633de91e11d1d

Update dependencies.

push time in 5 days

push event snikproject/snik-graph

Konrad Höffner

commit sha 2f64b4a1c4609d5dc9e269a7122bebc1f3008f34

Further autocomplete testing.

push time in 5 days

push event snikproject/snik-graph

Konrad Höffner

commit sha 2e4059b3d9f012b14e27531669e633de91e11d1d

Update dependencies.

Konrad Höffner

commit sha 5aef7ab145c08af9b7168e26790c7c7c7b58fb23

Testing algolia autocomplete.

push time in 5 days
