I have been searching for ways to do this but I was not able to find any.

What I want to do is this:


movie_genre (Table)
movidId movieTitle movieGenre (Varchar)
1 Copying Beethoven (2006) Biography
2 Copying Beethoven (2006) Drama
3 Copying Beethoven (2006) Music

movie_plot (Table)
movieId movieTitle moviePlot
1 Copying Beethoven (2006) A fictionalized account of the last year of...

I would like to merge them into a single table:

movie_title (PK)
movie_release_year (YEAR)
movie_plot (TEXT)
movie_genre
(SET('Action','Adventure','Adult','Animation','Comedy','Crime','Documentary','Drama','Fantasy','Family','Film-Noir','Horror','Musical','Mystery','Romance','Sci-Fi','Short','Thriller','War','Western'))
movie_average_rating (FLOAT)

I am temporarily using the following statement:

SELECT (SELECT DISTINCT movie_genre.movieTitle), movie_genre.movieReleaseYear, movie_genre.movieGenre, movie_plot.plot FROM movie_genre LEFT OUTER JOIN movie_plot ON movie_genre.movieTitle = movie_plot.movie_title WHERE movieTitle = '"Weeds" (2005)';

I am using MySQL & InnoDB. Executing the query above gives me redundant results (3 rows with same title, release year and plot entity but different genres). I would like to merge them into a single table. If this is not possible, is there a way to merge rows in (movie_genre) table?

movie_genre (Table)
movidId movieTitle movieGenre (Varchar)
1 Copying Beethoven (2006) Biography
2 Copying Beethoven (2006) Drama
3 Copying Beethoven (2006) Music

into

movie_genre (Table)
movidId movieTitle movieGenre (Varchar)
1 Copying Beethoven (2006) Biography, Drama, Music

First of all make sure that your set definition is complete. As of now, it lacks "Biography" and "Music".

Afterwards go along those lines:

drop table if exists movie_genre;
create table movie_genre
(id integer
,title text
,genre varchar(255)
);
drop table if exists movie_plot;
create table movie_plot
(id integer
,title text
,plot text
);
insert into movie_genre values
(1, 'Copying Beethoven (2006)', 'Biography' ),
(2, 'Copying Beethoven (2006)', 'Drama' ),
(3, 'Copying Beethoven (2006)', 'Music' )
;
insert into movie_plot values 
(1, 'Copying Beethoven (2006)', 'A fictionalized account of the last year...')
;
drop table if exists movies_merged;
create table movies_merged 
(id integer
,title text
,plot text
,genre set ('Action','Adventure','Adult','Animation','Biography','Comedy','Crime','Documentary','Drama','Fantasy','Family','Film-Noir','Horror','Music','Musical','Mystery','Romance','Sci-Fi','Short','Thriller','War','Western')
,rating float
);
insert into movies_merged (id, title, plot,genre)
select a.id, a.title, b.plot, group_concat(a.genre)
from movie_genre a, movie_plot b
where a.title=b.title
group by a.title
;

Beware: MySQL will pick any (presumably the first) ID from the group, as this field is not processed by an aggregate function.

Doing that gives me "MySQL server has gone away".

There are 100,0000 rows in movie_genre and roughly 600,000 rows in movie_plot.

Some of the movies do not have plot summary also.

Will changing "DBMS connection keep-alive interval (in seconds) solve the problem?

Try on a smaller sample first.
Set an index on all relevant fields.
Which interface are you using? I would recommend to start the process from a mysql console on the database server. Be prepared to wait quite some time until it finishes (if ever). You can look at what the server is doing with "show status" in a second mysql console.

INSERT INTO movie (movie_title, movie_release_year, movie_plot, movie_genre, movie_average_rating)
SELECT a.movieTitle, a.movieReleaseYear, b.plot, GROUP_CONCAT(a.movieGenre)
FROM movie_genre a, movie_plot b
WHERE a.movieTitle = b.movie_title
GROUP BY a.movieTitle;

Works fine with a small set. However, the problem is some of the movies in movie_genre do not have plot summary

For example

movie_genre
A Flashlight Flivver (1915) Comedy
A Flashlight Flivver (1915) Short

movie_plot
No entry found

The query only merges rows that have plot summary. Is there a way that I merge movies that has no plot?

Do it in two steps. First without the plot, then update the plot field with the plot field content from the old table.
This assumes, that all non-null plot fields have identical content. If they have not, select the longest field content.

insert into movies_merged (id, title, genre)
select a.id, a.title, group_concat(a.genre)
from movie_genre a
group by a.title
;
update movies_merged a set plot = 
(select plot from movie_plot b where a.title=b.title and plot is not null)
;

This should also be much more performant than the single query.

It works. Thank you for your kind reply

Now I have to study what GROUP_CONCAT is about. Thank you!

I wonder why I am getting this error

Error code: 1265
Data truncated for column 'movie_genre' at row 88640

Following is the sql statement:

INSERT INTO movie (movie_title, movie_release_year, movie_genre)
SELECT a.movieTitle, a.movieReleaseYear, GROUP_CONCAT(a.movieGenre)
FROM movie_genre a
GROUP BY a.movieTitle;

And data that row 88640 contains:
movieId movieTitle movieReleaseYear movieGenre
88640 Azap(1973) 1973 Romance

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.