
The website didn't load for me. So here it is: https://web.archive.org/web/20200615193041/https://lucperkin...

Also, I'd like to add one database to the list (I've worked there for 3 weeks now): TriplyDB [0]. It makes working with linked data easier.

Linked data is useful when people from different organizations want a shared schema.

In many commercial applications one wouldn't want this, as data is the valuable part of a company. However, scientific communities, certain government agencies and other organizations -- that I don't yet know about -- do want this.

I think the coolest application of linked data is how the bio-informatics/biology community utilizes it [1, 2]. The reason I found out at all is that one person at Triply is working to see whether something similar can be achieved in psychology. It might make conducting meta-studies a bit easier.

I've read the HN discussions on linked data and agree with both the naysayers (it's awkward and too idealistic [4]) and the yea-sayers (it's awesome). The thing is:

1. Linked data is open, open as in open source; the URI [3] is baked into its design.

2. While the 'API'/triple/RDF format can be awkward, anyone can quite easily understand it. The cool thing is: this includes non-programmers.

3. It's geared towards collaboration. In fact, reading between the lines, I'd argue it's really good for collaboration among a big, heterogeneous group of people.

Disclaimer: this is my own opinion, Triply does not know I'm posting this and I don't care ;-) I simply think it's an interesting way of thinking about data.

[0] triply.cc

[1] A friend of mine once modeled part of the biochemistry of C. elegans from linked data as Petri nets: https://www.researchgate.net/publication/263520722_Building_...

[2] https://www.google.com/search?client=safari&rls=en&q=linked+... -- I quickly vetted this search

[3] I still don't know the difference between a URI and URL.

[4] I think back in the day, linked data idealists would say that all data should be linked to interconnect all the knowledge. I'm more pragmatic and simply wonder: in which socio-technological context is linked data simply more useful than other formats? My current very tentative answer is those 3 points.



I recently had to deal with some RDF data expressed as N-Triples. So, naturally, I loaded it into a proper triplestore, embraced the whole W3C RDF universe, and started teaching myself SPARQL, right? Nah, instead I just translated that shit into CSV and loaded it into Postgres, then post-processed it into a relational schema that I can actually understand and query with a language that other people at my company also know. The RDF was just a poorly specified way of communicating that schema to me, along with a data format that's no better than a 3-column CSV. Great stuff from the Semantic Web folks here, real powerful technology.
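
For what it's worth, canonical N-Triples really is close to a 3-column file: the subject and predicate are space-free IRIs, so a naive split recovers the columns. A minimal sketch of that translation (the ex.org IRIs are made up; a real parser such as rdflib is safer for arbitrary input):

```python
def ntriples_to_rows(nt_text):
    """Naively split canonical N-Triples into (subject, predicate, object) rows.

    Assumes one triple per line with a trailing ' .' terminator. Subject and
    predicate are space-free IRIs, so everything after the second space is
    the object (which may be a literal containing spaces).
    """
    rows = []
    for line in nt_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        s, p, o = line[:-1].rstrip().split(" ", 2)  # drop the closing '.'
        rows.append((s, p, o))
    return rows

nt = (
    '<http://ex.org/a> <http://ex.org/name> "Ada Lovelace" .\n'
    '<http://ex.org/a> <http://ex.org/knows> <http://ex.org/b> .'
)
for row in ntriples_to_rows(nt):
    print(row)
```

From there it's a `COPY ... FROM` into a 3-column Postgres table and plain SQL for the post-processing.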

Edit: Also, to answer your question: a URI (Uniform Resource Identifier) just identifies something, while a URL (Uniform Resource Locator) is a URI that also locates it, i.e. it points to an object at a particular location, and you can paste it into your web browser to view that something. A URI only has to follow the syntax; your domain and directory structure provide a kind of namespace that you control, but beyond that a URI can contain literally anything. It doesn't have to be human-readable or resolve to anything in a web browser. It might as well be mycompany_1231241542345.


> ...I just translated that shit into CSV and loaded it into Postgres, then post-processed it into a relational schema that I can actually understand and query with a language that other people at my company also know. The RDF was just a poorly specified way of communicating that schema to me, along with a data format that's no better than a 3-column CSV. Great stuff from the Semantic Web folks here, real powerful technology.

I'm not sure what point you're trying to make; that is exactly what RDF is for! It's not an implementation technology, it's purely an interchange format. You shouldn't be using a fully general triplestore unless you really have no idea what the RDF you work with is going to look like.

SPARQL is the same deal; it's good for exposing queryable endpoints to the outside world, but if your queries are going to a well-defined relational database with a fixed schema and the like, you should just translate the SPARQL to SQL queries and execute those.
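
To make that concrete, here's a hedged sketch: a SPARQL basic graph pattern over a made-up ex.org vocabulary, next to the plain SQL it collapses into once you know the triples describe a fixed person(name, age) table (sqlite3 here just to show the SQL runs):

```python
import sqlite3

# A SPARQL basic graph pattern over a hypothetical ex.org vocabulary:
sparql = """
SELECT ?name WHERE {
  ?p <http://ex.org/name> ?name .
  ?p <http://ex.org/age>  ?age .
  FILTER(?age > 30)
}
"""

# With a fixed relational schema, the same question is one line of SQL:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [("Ada", 36), ("Bob", 25)])
rows = conn.execute("SELECT name FROM person WHERE age > 30").fetchall()
print(rows)  # [('Ada',)]
```

Each triple pattern on a shared variable becomes a column (or a self-join) over the same table, which is why the mechanical SPARQL-to-SQL translation works for well-defined schemas.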


Well, I’m glad to hear that my solution was sane, but I just don’t see what technological innovation the RDF contributed. The file was canonical N-Triples, i.e. a CSV with spaces instead of commas. The predicates establish relationships between subjects and objects, but those relationships could be one-to-one, one-to-many, or many-to-many. Should a given predicate be modeled relationally as a column, or with some kind of join table? I have no idea from the RDF. And those objects: are the types I see attached to the values the only types that could possibly appear for that predicate? Who knows! Sure, the data has been interchanged, but the “format” is so generic that it’s useless. Why not just give me a 3-column CSV and tell me to figure it out on my own, rather than pretend that RDF provided some improvement?
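
The column-vs-join-table ambiguity is easy to demonstrate. A sketch with hypothetical predicates (nothing in an N-Triples file itself says which shape a predicate has; you only find out by profiling the data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# If <http://ex.org/name> is functional (at most one object per subject),
# it fits as a column on the subject's table:
conn.execute("CREATE TABLE person (id TEXT PRIMARY KEY, name TEXT)")

# If <http://ex.org/knows> is many-to-many, it needs a join table:
conn.execute("""CREATE TABLE knows (
    subject TEXT NOT NULL,
    object  TEXT NOT NULL,
    PRIMARY KEY (subject, object))""")

# Profiling the actual triples is the only way to tell the shapes apart:
conn.executemany("INSERT INTO knows VALUES (?, ?)",
                 [("a", "b"), ("a", "c"), ("b", "c")])
max_fanout = conn.execute(
    "SELECT MAX(n) FROM (SELECT COUNT(*) AS n FROM knows GROUP BY subject)"
).fetchone()[0]
print(max_fanout)  # 2: at least one subject has multiple objects
```

And even then, profiling only tells you about the data you have, not what the predicate is allowed to be, which is the complaint above.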


Thanks for your comment on it by the way. I'm still in the phase of gathering what everyone thinks of it. I've noticed that RDF seems a bit polarizing. I have the suspicion that people who feel neutral about it also don't feel the need to chime in.


There are RDF schema formats (e.g. RDFS and SHACL) that would tell you these things. Also, often the format/ontology/etc. is known informally anyway.


Ah, one has to resolve, the other doesn't have to. Thanks for the explanation.

It seems you're proving my point: you didn't need to collaborate outside of your company. So you picked the path of least resistance and that's totally what I would do too.

But what if you work together with a plethora of academic institutions and then decide that you want to keep options open such that other disciplines can connect to you and you can connect to them, automatically?

You could create a consortium and design a shared Postgres schema (among other things), since everyone knows it. Or you could just put your linked data online with a web page, no consortium needed. Anyone who wants to link to you can. And if they publish their data, then by no effort of your own, your data is enriched as well.

I view linked data as a DSL of sorts. DSLs are amazing, except when you try to force-fit them into something they shouldn't be force-fit into. You're arguing that one shouldn't force-fit it within a single organization.

And I agree with that since that's not where linked data shines. Just like SQL doesn't shine at making a raytracing engine, but that doesn't prevent anyone [1] ;-)

That's my current view anyway (again, three weeks in; so far I've mostly been dealing with Node.js and React issues).

Also, a lot of SPARQL looks like SQL (to my new eyes anyway). Here's a tutorial on it (for a basic feel, watch through episode 5 -- about an hour -- or watch just the first episode): https://www.youtube.com/watch?v=nbUYrs_wWto&list=PLaa8QYrMzX...

[1] https://news.ycombinator.com/item?id=21218144



