Context Aware MetaTag API


(Mathias Schreiber) #1

Discussion Topic

Up to v8, the API for setting meta tags is one-way: you can add a meta tag, but you cannot programmatically remove or overwrite it.
A proposal for a new API was merged into core in early Nov. 2017 and raised some concerns afterwards.
https://forge.typo3.org/issues/81464

The concern raised is that we only cover 80% of the standard (even though this solves 99% of the problems integrators have with it). So before diving into the topic again, I’d like to propose the following solution:

Context Aware Meta Tags

Each meta tag will be flagged as one of 3 possible states:

  • Static Meta Tags
  • Content Meta Tags
  • Special Meta Tags

To understand the reasoning, we first need to understand the problem we are trying to solve:
An integrator sets a variety of meta tags during the templating process.

  • generator
  • viewport
  • author
  • og:image
  • og:title
  • robots
Just to name a few.

If these are set at the templating level, they will always get rendered.
The tricky part is when an extension has a detail view, we’ll use EXT:news as an example here.
EXT:news might add its own og:image as well as its own og:title and an author.
These meta tags now get rendered multiple times with various levels of impact.
While an additional author might not be a problem, multiple og:image tags indeed are.
Workarounds like telling the editor not to upload an og:image into the page properties are shaky at best (and most of the time pointless, because the integrator is smart and provides a fallback).

Ideally EXT:news would instruct TYPO3 to remove all Content Meta Tags and set its own.
In our case a call to $pageRenderer->removeContentMetaTags() would remove:

  • author
  • og:image
  • og:title

Note that generator and viewport have been kept. This is because there is no reason to overwrite these programmatically.
EXT:news can now add its own meta tags.

In a special case, our integrator wants to set robots to noindex, nofollow and thus has to call $pageRenderer->removeSpecialMetaTags() first in order to do so.
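
The flow described above can be sketched as a small model. This is only an illustration in Python, not the TYPO3 PHP API: `MetaTagRegistry`, `TagContext` and the snake_case method names are hypothetical stand-ins for `$pageRenderer->removeContentMetaTags()` and `$pageRenderer->removeSpecialMetaTags()`. It also assumes that whoever adds a tag passes the flag along, which the proposal does not specify, and all tag values are made-up sample data.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class TagContext(Enum):
    """The three proposed flags (names are assumptions)."""
    STATIC = "static"
    CONTENT = "content"
    SPECIAL = "special"


@dataclass(frozen=True)
class MetaTag:
    name: str
    content: str
    context: TagContext


class MetaTagRegistry:
    """Hypothetical stand-in for the meta tag part of the PageRenderer."""

    def __init__(self) -> None:
        self._tags: list[MetaTag] = []

    def add(self, name: str, content: str, context: TagContext) -> None:
        # Assumption: the caller decides which flag a tag gets.
        self._tags.append(MetaTag(name, content, context))

    def _remove_context(self, context: TagContext) -> None:
        self._tags = [t for t in self._tags if t.context is not context]

    def remove_content_meta_tags(self) -> None:
        self._remove_context(TagContext.CONTENT)

    def remove_special_meta_tags(self) -> None:
        self._remove_context(TagContext.SPECIAL)

    def names(self) -> list[str]:
        return [t.name for t in self._tags]


registry = MetaTagRegistry()

# Integrator, at templating level (sample values).
registry.add("generator", "TYPO3 CMS", TagContext.STATIC)
registry.add("viewport", "width=device-width", TagContext.STATIC)
registry.add("author", "Site owner", TagContext.CONTENT)
registry.add("og:image", "/fileadmin/fallback.png", TagContext.CONTENT)
registry.add("og:title", "Generic page title", TagContext.CONTENT)
registry.add("robots", "index,follow", TagContext.SPECIAL)

# Detail view of an extension: drop all content tags, then set its own.
registry.remove_content_meta_tags()
registry.add("og:image", "/fileadmin/news/42.png", TagContext.CONTENT)
registry.add("og:title", "News headline", TagContext.CONTENT)

# Special case: robots has to be removed explicitly before it is set again.
registry.remove_special_meta_tags()
registry.add("robots", "noindex,nofollow", TagContext.SPECIAL)
```

In this sketch generator and viewport survive both calls, and og:image, og:title and robots each end up exactly once.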

Pro

  • It solves real-world problems rather than addressing academic issues
  • The API is concise and offers little room for discussion
  • The API is simple and easy to use

Con

  • tbd

Remarks and notes

Organizational

Topic Initiator: Mathias Schreiber
Topic Mentor: Benjamin Kott


(Georg Ringer) #2

I like the proposal in general, but the naming/grouping is hard to understand. E.g. “robots” is also kind of content related, and the wording “special” is hard to understand. The same goes for “viewport”: there could be e.g. a forum provided by an extension which needs to change the viewport for whatever reason.

There should also be a method ::remove($name) to remove a given meta tag.
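
To illustrate what such a method could look like, here is a minimal sketch in Python rather than PHP. `MetaTagList` and the choice to return the number of removed tags are assumptions for illustration only; only the idea of `::remove($name)` comes from this post.

```python
from __future__ import annotations


class MetaTagList:
    """Hypothetical container; not an existing TYPO3 class."""

    def __init__(self) -> None:
        self._tags: list[tuple[str, str]] = []

    def add(self, name: str, content: str) -> None:
        self._tags.append((name, content))

    def remove(self, name: str) -> int:
        """Remove every tag with the given name and return how many were removed."""
        before = len(self._tags)
        self._tags = [(n, c) for (n, c) in self._tags if n != name]
        return before - len(self._tags)

    def all(self) -> list[tuple[str, str]]:
        return list(self._tags)


tags = MetaTagList()
tags.add("viewport", "width=device-width, initial-scale=1")
tags.add("robots", "index,follow")

# An extension (e.g. a forum) replaces the viewport by name,
# without needing to know which group the tag belongs to.
removed = tags.remove("viewport")
tags.add("viewport", "width=1024")
```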


(Markus Klein) #3

As discussed in person with @dermattes1 at the Wiesbaden sprint, I see only one way to make this a future-proof and usable solution.

The non-html5 metatags like OG, Schema, etc. usually put an extra amount of semantic information into the data. (An example is the positional information required for og:image:width, as it belongs to the preceding og:image.)
In order to make such information accessible for manipulation, we need some kind of manager for those concepts.

I hereby propose to create managers for OG, Schema.org and HTML5. Those managers hand over the generated metatags to the PageRenderer when the page is rendered. This way we can store semantic information (like image + width for OG) in the manager, which finally knows best how to render this information correctly as metatags.
Extensions can then access those managers to modify information - like replacing stuff.
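
A rough sketch of what such a manager could look like for OG, written in Python as a language-neutral model. `OpenGraphManager`, `OgImage` and all method names are hypothetical, and the values are sample data. The point is only that image and width are stored together and the manager decides the output order at render time.

```python
from __future__ import annotations

from dataclasses import dataclass
from html import escape
from typing import Optional


@dataclass
class OgImage:
    """An image together with the properties that belong to it."""
    url: str
    width: Optional[int] = None


def _meta(prop: str, content: str) -> str:
    # Open Graph tags use the "property" attribute.
    return f'<meta property="{escape(prop)}" content="{escape(content)}">'


class OpenGraphManager:
    """Hypothetical manager that stores semantics, not finished tags."""

    def __init__(self) -> None:
        self._title: Optional[str] = None
        self._images: list[OgImage] = []

    def set_title(self, title: str) -> None:
        self._title = title

    def add_image(self, image: OgImage) -> None:
        self._images.append(image)

    def replace_images(self, image: OgImage) -> None:
        # Replacing drops the old image and its width in one step.
        self._images = [image]

    def render(self) -> list[str]:
        tags: list[str] = []
        if self._title is not None:
            tags.append(_meta("og:title", self._title))
        for image in self._images:
            tags.append(_meta("og:image", image.url))
            if image.width is not None:
                # Width is emitted directly after the image it describes.
                tags.append(_meta("og:image:width", str(image.width)))
        return tags


og = OpenGraphManager()
og.set_title("Generic page title")
og.add_image(OgImage("/fileadmin/page.png", width=1200))

# An extension replaces the information instead of appending to it.
og.set_title("News headline")
og.replace_images(OgImage("/fileadmin/news/42.png", width=800))

rendered = og.render()
```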


(Frans Saris) #4

Just for future reference: The ability to have multiple meta tags with the same name was added during 7 LTS development https://docs.typo3.org/typo3cms/extensions/core/7.6/Changelog/7.4/Feature-67360-CustomAttributeNameAndMultipleValuesForMetaTags.html


(Jigal Van Hemert) #5

If I read this correctly, it will add knowledge about the existing metatags and their behaviour to the core. Removing the “content” metatags removes certain metatags which we know belong to this category. Or is the category set by the integrator/programmer when they add a metatag?

Furthermore, we cannot be sure that certain existing tags should be removed. It’s perfectly valid to have multiple og:image meta tags. Also, some of these og: tags should be kept together in groups; an og:image:width tag belongs together with the og:image tag preceding it.

For some tags the values can be combined. In those cases it’s valid to add the new value to the existing one instead of overwriting it.

Do we want to add this knowledge to the core so we need to update it constantly with new tags that come around? Or will we rely on the integrator/programmer to have this knowledge and use the right extra parameters to set these behaviours?


(Mathias Schreiber) #6

While I agree that it is technically possible to have multiple og:image tags (to pick up the example), it makes no sense whatsoever to keep these and just add to them.

My reasoning:
Say you add an og:image for the current page.
This image gets pulled in by any og consumer.
Now EXT:news adds its own og:image, which belongs to the news post. Why on earth would anyone add this image to the list of consumable images?
So people on Twitter will choose between the generic image (the one from pages) and the one that belongs to the actual context of the current page (being the image of the news entry)?
As a marketer you lose your entire control over your outside communication.

To quote Hunt for Red October here… “Would you launch a nuclear missile horizontally?” - "Sure… but why?"
Just because something is technically possible doesn’t automatically make it viable :)
We try to solve integrator problems here and every integrator I talked to has the same issues:
Not being able to programmatically access meta tags that have been added somewhere by something.

The goal here is to give integrators the possibility (which they don’t have so far) to throw away all meta tags of a certain type and regain control over their source.


(Riccardo De Contardi) #7

I have not understood the difference between “static” and “content”. By “content”, do you mean a meta tag that can be overwritten by an extension (like in the example of the ext:news detail page)?


(Richard Haeser) #8

In general I think it is a good idea, although I do have some questions.

About the managers: how do you want to do that, and how extensible would you make it? For example, Twitter Cards are a feature that is used a lot too and that also adds some semantic information to the data. Do we need to create a manager for all those kinds of metadata?

I do have the same question as @jigal. How is it decided which metatags belong to which “container”? Do you have to set it when adding a metatag? In that case, it is likely that devs will mix up those containers, resulting in duplicate metatags as well. Doing it in core will mean a lot of maintenance when new metatags appear.

Aren’t we making things too hard? Can’t we just say: by default, overwrite the existing metatag, and if you want a duplicate metatag, pass a parameter to add it instead of replacing it? Or am I just missing the point?
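
What I mean could look roughly like this. It is a sketch in Python rather than PHP, and `set_meta_tag` and its `replace` parameter are made-up names, not an existing API; the values are sample data.

```python
from __future__ import annotations


def set_meta_tag(
    tags: dict[str, list[str]],
    name: str,
    content: str,
    replace: bool = True,
) -> None:
    """Overwrite by default; append only when explicitly requested."""
    if replace or name not in tags:
        tags[name] = [content]
    else:
        tags[name].append(content)


tags: dict[str, list[str]] = {}

# Integrator sets a fallback image.
set_meta_tag(tags, "og:image", "/fileadmin/page.png")

# Extension sets its own image: the fallback is replaced by default.
set_meta_tag(tags, "og:image", "/fileadmin/news/42.png")

# A second image is only added when the caller asks for it.
set_meta_tag(tags, "og:image", "/fileadmin/news/42-alt.png", replace=False)
```

With this default nobody has to know which “container” a tag belongs to; duplicates only appear when someone opts in.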


(Markus Klein) #9

Yes we would need a dedicated “manager” for those too.

Because I was asked about that today at TUGA: These metadata-managers should really act as a central component to output stuff in whatever format. A schema.org manager would maybe also need to be able to output/add JSON-LD data to the page - besides usual microdata, etc. So these should really be thought of and designed as “channel-agnostic”.

And yes, whenever the semantics of one of those systems (e.g. html5) change, we need to adapt. But hey, that’s what the whole web is about: adaptation.


(Richard Haeser) #10

That seems a good reason: not only rendering metatags, but also adding meta information in other ways. So managers seem to be a good option for that. I think, however, that the core should ship the most important ones, and that an extension developer should have the possibility to add their own manager for a specific use case.


(Markus Klein) #11

Yes, those managers should actually only use interfaces to provide the information to the PageRenderer.
I would imagine the PageRenderer to hold a list of all managers and to ask the managers for their data whenever it seems appropriate for the PageRenderer.
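
As a sketch of that idea (Python used as a neutral modelling language, not the actual PHP implementation): the interface `MetaTagManager`, the two example managers and `register_manager` are all hypothetical names, and the `PageRenderer` class here is a toy model of the real one.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from html import escape


class MetaTagManager(ABC):
    """The only thing the PageRenderer knows about a manager."""

    @abstractmethod
    def render_meta_tags(self) -> list[str]:
        ...


class Html5Manager(MetaTagManager):
    """Example manager that could ship with the core."""

    def __init__(self) -> None:
        self._tags: dict[str, str] = {}

    def set(self, name: str, content: str) -> None:
        self._tags[name] = content

    def render_meta_tags(self) -> list[str]:
        return [
            f'<meta name="{escape(n)}" content="{escape(c)}">'
            for n, c in self._tags.items()
        ]


class TwitterCardManager(MetaTagManager):
    """Example manager an extension could bring along."""

    def __init__(self, card: str) -> None:
        self._card = card

    def render_meta_tags(self) -> list[str]:
        return [f'<meta name="twitter:card" content="{escape(self._card)}">']


class PageRenderer:
    """Toy model: holds the managers and asks them at render time."""

    def __init__(self) -> None:
        self._managers: list[MetaTagManager] = []

    def register_manager(self, manager: MetaTagManager) -> None:
        self._managers.append(manager)

    def render_meta_tags(self) -> list[str]:
        lines: list[str] = []
        for manager in self._managers:
            lines.extend(manager.render_meta_tags())
        return lines


html5 = Html5Manager()
renderer = PageRenderer()
renderer.register_manager(html5)
renderer.register_manager(TwitterCardManager("summary"))

# Managers can still be modified until the page is rendered.
html5.set("robots", "noindex,nofollow")
output = renderer.render_meta_tags()
```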


(system) #12

This topic was automatically closed after 14 days. New replies are no longer allowed.