Generate DB scheme from TCA

I disagree on that, even if you go as deep as 5NF.

Do you have a wrap-up of what you exactly gonna do?
And of course nobody stops anybody from doing something. :wink:

There is more then SQL today. Document types and language overlays confront us daily with the object-relational impedance mismatch.

Proper normalisation can’t solve it. It’s the main reason, why it’s difficult to reach proper normalisation at all. Normalisation is powerful. With documents as main data it is not the golden hammer, that can ever handle ALL of it.

The question if and how TYPO3 can overcome this limitations will be crucial one day. Yet moving away from SQL to more TCA will not bring solutions. We don’t have a model more sound than SQL at hand.

Please provide a concept first and put it up to discussion. In the past we had the blueprints, but lately many changes went in without a proper discussion (outside of slack) or a written concept. This will also ease reviews later.

1 Like

A TCA do not depend on a database, the data could be anywhere else. Especially after the backend form rewrite it is easier possible to inject data which is stored anywhere and retrieved/saved over which service ever. So I wouldn’t go back and bundle it more with the database. The storage should be independent from the manipulation handling.

1 Like

Do we really have to keep the non-ORM tables?
Wouldn’t it be much straightforward to use models everywhere?

This does not mean you have to use the EntityManager to retrieve data. It would also be possible to retrieve a queryBuilder from the EntityManager, thus we are able to handle the current statements and do not have to rely on the ORM-Mapping etc.

Using ORM:

PRO

CON

  • Datahandler has to be rewritten at some places
  • ORM Mapping for Extbase has to be rewritten

If I missed some drawbacks here, please add them.

1 Like

Well I’ve already started with implementation. This is not fixed. But it works :slight_smile:

Forge: https://forge.typo3.org/issues/81234
Review: https://review.typo3.org/#/c/52808/

1 Like

Hey there, my 2 cents on this topic:

TLDR: Yes, let’s create schema automatically from TCA. Armins patch that takes care of ‘ctrl’ section is a great start, for ‘columns’ fields we’ll have to sort out syntax, though.

In general, TCA is the main array in TYPO3. We’ve been embracing this one over the past years already: It is relatively well encapsulated and it is almost always loaded in basically all situations. Soon, it will be possible to warm it up, too.

This array is the main glue it TYPO3: It contains field definitions, it steers formEngine, it steers the DataHandler, extbase mapping is based on it and extbase persistence should be based on it properly, it leads most of the main views of the backend.

In standard extension scenarios, field definitions are written at three times: ext_tables.sql, TCA, extbase models. That’s really annoying for extension developers. I think further pushing TCA can solve that. It is our central configuration solution.

Thus, I talked with Armin a couple of weeks ago about that: It is pretty easy to get rid of ext_tables.sql for tables based on TCA.
The most low-hanging fruit in here is automatic creation of fields steered from within the ‘ctrl’ section: All those nasty deleted, hidden and workspace fields. We have properties for all those in TCA already. They are always the same and can be created without any TCA syntax change. As you can see in the patch https://review.typo3.org/#/c/52808/ hundreds of redundant lines can be kicked from ext_tables.sql this way. I think that’s really a nice start.

For single fields and keys, we’ll have to add some additional information here and there in TCA. While an type=input, eval=int could be made a default integer field already, we may need the one or the other idea on how to override default field length and stuff like that. I’m not yet entirely sure on how this will look like, though.

Well, DB defaults should apply unless the user actively defines a different length in 'database' => ['length' => xxx] (my suggestion).

For columns that are not in the DB, we could create a new ctrl key ‘virtualFields’ that are not processed by scheme creation. This might even be a useful information for datahandler, such that it can leave those fields untouched.

I am not sure if it is wise to process the eval condition to deduce the field type. I would prefer I we could keep display properties and DB properties separated as much as possible to have a clear definition of responsibility.

You mean virtualTables right? Because virtual fields are already existing, aka type=“passthrough”.

Good question how to handle MM-Tables and so many tables which are not in TCA yet. Can we call such tables “shadow tables” please? :smiley:

In addition to my first approach, I would like to add more rules. For each type (input/text/select/…) one. It should auto-guess the best column definition, based on TCA. Like you already mentioned, with setting the the length of varchar, if a field just allows n chars, etc.

But this magic must be overwritable. So a global option for any defined column in TCA, which sets the sql definition, could be a solution for this.

The only additional option in ctrl part should be the management of creating indexes on one or more columns.

Good point with the passthrough fields, afaik they still need to exist in the database. If needed, we could introduce a new type virtual. Until now, people seemed to have done quite well without it.

Regarding mm tables, they are in the tca at least at one side of the relation. Therefore we should be able to deduce them including all required fields in the relation.

One more note, please split this up into multiple patches. This makes reviewing a lot easier.

MM tables should be able to get generated automatically, by MM settings in group/inline configuration. For mm tables with additional attributes, like sys_file_reference, you can just define proper TCA, like sys_file_reference.

For what do you need a new type “virtual”? sys_file_reference TCA for example sets rootLevel in ctrl part to -1 and the whole table becomes invisible in backend, but data is accessible. And field type “passthrough” makes single fields invisible. If this is what you’ve meant with “virtual”, I could missunderstand you.

Based on my first start, these are the next questions/tasks, I guess:

  • Write rules to generate sql schema for all column types
    • also they must generate mm tables
  • Add new TCA option (API related) for all column types, which allow to overwrite the schema-auto-guess, performed in rules (or probably the TcaSqlBuilder is a better, more central place). My suggestion for a name: schema or sqlSchema. I’ve also added an example below
  • Write rule to generate sql indexes
  • Define new TCA option (API related) which controls the indexes. Maybe something like the example below.
  • Create TCA for missing sys tables (rootLevel: -1 and columns are passthroughs with hardcoded sqlSchema)
  • Translate all ext_tables.sql files to sqlSchema (if necessary, ideally auto-guess-schema works for most columns)
  • Remove all ext_tables.sql files
  • Documentation (for extension developers)

Example of possible new option, to set hardcoded sql schema (and overwrite auto-guess):

<?php
$GLOBALS['TCA']['pages']['columns']['backend_layout_next_level']['config']['sqlSchema']
= 'varchar(64) DEFAULT \'\' NOT NULL';

Example of possible new option, to control sql indexes in TCA:

<?php
$GLOBALS['TCA']['pages']['ctrl']['sqlIndexes'] = [
  't3ver_oid' => 't3ver_oid,t3ver_wsid',
  'parent' => 'pid,deleted,sorting',
  // ...
];

Andreas Fernandez asked a good question: Values from ext_tables.sql are completely ignored, when the column is defined by rules. I mean, the main goal is to get rid of the whole file… But until then… Maybe we need a new Slot, used before extensions’ sql files are loaded?

You got it backwards. The field type virtual would be for fields in form engine that do not exists in the database. The question would be if we need it, because people have been doing this without it just fine.

I like it.

Please please make sure that the column definition for columns pointing to other records are always the same. Otherwise it is impossible to introduce proper foreign key constrains.

I am not an expert; generally speaking I would be in favor of steps that go in the direction of “hiding” the database itself to the integrator (and also the developer). I don’t know if I made myself clear

We have a functioning doctrine migration POC in out internal git, no automated migration generation yet, but all the execution of migrations … it is also not that complicated to have automated migrations

1 Like

Just my 2 cents:

TCA is loaded for each request. Either FE or BE. SQL information is needed only once in a while (on DB comparison). I would prefer not to blow up TCA information. There are even fields in the database which are not part of TCA yet and might be added (and TCA is blown even more).

3 Likes

The way I see this is, that I would love to see ext_tables.sql die out. In contrast to many other opinions, I do not see very many ext_tables.sql files that add a special definition for tables and fields, that is so much different from the TCA field definition. Often times what I see is, that someone defines a field type in the SQL file, that is even wrong in regards to the TCA definition. I think for most extension developers, the SQL definition is just an unwanted part of creating and maintaining an extension, that takes up time unnecessarily, since one has to remember to change field type definitions in TCA and the corresponding SQL definition.
My personal nightmare was an extension that has been created using the kickstarter, but the maintainer only updated the TCA definition file, not the ext_tables.sql. He just added the database fields manually to the database, this caused mayhem when we tried to migrate the system onto a new TYPO3 version, with the database schema tool suggesting 90% of tables and fields to be deleted. I think such cases could be avoided by having the TCA definition being source of both, the backend view and the database definition, since one has to define a field to actually see it, conveniently triggering the generation of the field in the database as well.

So I would love to have just one definition source, the TCA.

I do not think this is a good idea. This makes it easy to start, but (usually) results in long term technical debt. SQL should be a first class citizen for each application developer that handles data models (unless you go no-sql).

2 Likes

If fields exist for technical reason only, they should only be hinted as pass-through in TCA. But what do fields do in the DB if TYPO3 should not know about them (not in TCA)? Is this external data?

Anyway DB scheme knowledge is required by datahandler, e.g types, relations. So the only overhead are keys and maybe size information, the later which might come handy for datahandler too.

I thought of fields like uid, pid, workspace related fields, which exists in DB but not in TCA - TYPO3 related fields, currently not defined in TCA - no external data.

As far as I know the TYPO3 core neither DataHandler nor FormEngine care about any SQL related information. Both do their jobs on TCA information only. Currently types and relations don’t contain any SQL information. Abstraction for table names / fields are stored as plain text information only. And IMHO is will still be needed even without ext_tables.sql file.

Good point. But do you have a better idea, where to put these informations to? And please don’t say: “ext_tables.sql” :wink:

When I look at size of e.g. FAL field configuration, I can’t imagine that the mentioned options (“sqlSchema” in columns and “indexes” in ctrl part) are blowing TCA up that much.