SQL efficient schedule generation algorithm SQL efficient schedule generation algorithm sql sql

SQL efficient schedule generation algorithm


This answer is just meant as a solution direction for the schedule part, not a 100% nice solution:

What you created, requires loops to be able to satisfy all the conditions.

To get such a case solved quicker, it can be practical to work in vectors instead in which in the vector all positions are represented by 0 (available) and 1 (taken).

So the student/math-1 issue:

Say there are 2 rooms and 3 hours: The math-1 vector per room is then:

Room 1: [0 0 0]Room 2: [0 0 0]

Essentially (I at least) do not care about if a certain room is available as long as 1 is available:So an AND per index could be the answer in this case for availability (remember: 0 is available):

Room 1: [1 0 0] Room 2: [0 0 0] Room result: [1 0 0] AND [0 0 0]=[0 0 0]

So an AND can tell if the first hour is still available.

If you now combine this with a student with the available hours (also just 3 for this example):

Student A: [0 0 1] Room result: [0 0 0] Student matches with room using an OR for this operation: [0 0 1] OR [0 0 0]=[0 0 1]

So the student A would match into room result.

In SQL: The data model (part: Missing is the course match):Table room:

CREATE TABLE room(room_id INT,space TINYINT DEFAULT 0,hour INT DEFAULT 1);CREATE TABLE student(student_id INT,space TINYINT DEFAULT 0,hour INT DEFAULT 1)

All data has been inserted into tables in full: In this case 1 room, 3 hours, 3 places available.

INSERT INTO room VALUES (1,0,1);INSERT INTO room VALUES (1,0,1);INSERT INTO room VALUES (1,0,1);INSERT INTO room VALUES (1,0,2);INSERT INTO room VALUES (1,0,2);INSERT INTO room VALUES (1,0,2);INSERT INTO room VALUES (1,0,3);INSERT INTO room VALUES (1,0,3);INSERT INTO room VALUES (1,0,3);

Student has:

INSERT INTO student VALUES(1,0,1);   INSERT INTO student VALUES(1,0,2);   INSERT INTO student VALUES(1,1,3);   

So the student is available in the first two hours only.

To now get a result from a query:

SELECT room_idFROM room aINNER JOIN student b ON a.space=b.space AND a.hour=b.hour;

This result only has to be split into the groups of maximum of 8, in which it is the end of the SQL part and time for another programming language.

This model can be expanded with a date, however it works best when using just hours and weekdays (weekday availability is again 0 or 1).

As I stated: this is a concept/idea, not a 100% solution, so it needs work before you can use it.....


I believe what you are describing is a version of the constraint satisfaction problem, which is frequently used solve resource allocation issues. It is very likely that the solution is going to be NP-complete, or in other words, the time needed to solve the problem will grow exponentially as the size of the problem (in this case the number of students/classes/rooms) grows. This is one of the classic outstanding problems in computer science. There is no known perfect solution, but that doesn't mean there isn't something that will be useful in your situation. I'll try to describe your problem in a bit more detailed fashion before suggesting a path toward a solution.

Two problems

You have at least two problems that you are trying to solve:

  1. Is it possible to find any combination of students-groups-classes that will fit into the available times-rooms?
  2. From the possible combinations, is one of them more optimal than another? And is it possible to determine which combination is optimal in a reasonable amount of time?

First, it is very likely that there are no possible combinations that would satisfy your constraints. To demonstrate this, imagine that you have only two students and only one classroom that is available for only one hour. If the two students can be put into the same group, then it is possible to schedule them both into the one classroom at the same time. However, if the two students cannot be grouped, e.g. one is "dumb" and one is "clever," then there is no combination of resources that will satisfy your constraints.

While it is easy to determine if a solution exists in a very simple case like I have described, it is very difficult to determine if a solution even exists for an arbitrarily large set of students/classes/rooms.

Setting Reasonable Limits

First of all, it is easy to set an absolute upper bound on the number of students that can be enrolled. The theoretical max enrollment equals

rooms * hours * students/room / hours/student

So for example if you have 100 rooms available for 5 hours each, each room can hold 8 students, and each student needs to study for 5 hours:

100 * 5 * 8 / 5 = 800 students

However, it is extremely unlikely, given a random collection of students of varying grades and ability levels, that you'll ever be able to accommodate this theoretical maximum.

If we come from the other end of the spectrum, assume you have 500 classroom hours (100 rooms * 5 hours), then you know that you can always accommodate at least 100 students (1 student per room * 5 hours). The trick is to figure out a reasonable upper bound, between 100 and 800, that makes this problem solvable in a reasonable amount of time.

In order to make a reasonable guess at what this upper bound should be, it seems prudent to look at the constraints on group formation.

Grouping Constraints

Students are classified in two dimensions:

  1. Ability Level: Dumb, Normal, Clever (D, N, C)
  2. Grade level: 9, 10, 11, 12

Which means you have 12 categories of student: 9D, 9N, 9C, 10D, 10N, 10C, ...

Only some of those categories are compatible with each other for grouping, which gives you a finite number of potential group types. Assuming you only had 12 students, 1 of each of the 12 types, the theoretical maximum number of group types (assuming ANY student type can be paired with any other type), would be 12!/4! = 19,958,400. But given the constraints the actual possible number of group types will be smaller. As it turns out, I think we can safely reduce the group types to four, each of which is made up of some combination of students of types:

  1. 9D, 9N, 10D, 10N
  2. 9N, 9C, 10N, 10C
  3. 11D, 11N, 12D, 12N
  4. 11N, 11C, 12N, 12C

There is some obvious overlap here since "normal" students can belong to more than one group type. But we're finally beginning to get some information that would be useful for group formation:

Start by assigning students in the most restrictive categories to groups first. Then add students in less restrictive groups.

In other words, students in the "dumb" and "clever" categories can only belong to one of the four group types, so they should be assigned first. So the algorithm might look like:

  1. For each course
  2. Select all of the clever/dumb students in grades 9/10 or 11/12
  3. Create as many groups of 8 as you can with students in that category
  4. Fill up the remaining groups with empty slots with "normal" students
  5. Group the remaining "normal" students into groups of 8

This should result in a grouping with the minimum number of groups possible. The problem with this is that it is only one of thousands (maybe millions) of other groupings that is possible. It is highly unlikely that this particular grouping will be the right one. We still can swap students in different groups, but we need a smart way to do this.

Scheduling Constraints

Now that you've got students assigned to groups, you can begin putting the groups in classroom/time slots. The primary constraint here is that you can't schedule two groups into a time slot which would require the student to be in more than one place at the same time.

Let's again start with a simpler example that we can actually envision in our heads. Let's assume there are only four courses, Art, Music, Math and Science, that will be taught in 2 time slots in 4 classrooms. We will have 8 groups of 2 students, noting that each student will be in 2 of the groups since each student is taking two of the available classes. For simplicity, let's assume all of students are in the same category, e.g. 9N, so can be swapped between groups without problem. The students represented by the letters A-H and a group is represented by two letters, e.g. group AB contains students A and B. Let's say the first schedule generated by the system looks like this:

         Art  Music  Math  ScienceTime_1    AB   CD     EF    AHTime_2    CD   EF     GH    GB

Each course is taught twice, and we see that all of the groups are made up of a valid set of students, but we see that students A and G are both double booked: A has two classes at Time_1 and G has two classes at Time_2. The simple thing to do is to swap A and G in their science times:

         Art  Music  Math  ScienceTime_1    AB   CD     EF    GHTime_2    CD   EF     GH    AB

But there are also more complex solutions that involve moving a lot of people around and changing all of the groups, e.g.:

         Art  Music  Math  ScienceTime_1    AC   ED     GF   BHTime_2    BD   FC     HE   AG

Clearly, one of these is more efficient than the other, but there is no easy way for a computer to tell the difference. As humans we can see the solution relatively quickly in this example, but imagine dozens of courses with 8 students in each and you see how quickly this becomes a mess. Obviously we don't want to have to check all of the possible permutations in a brute force to find a solution.

Of course another solution would be just to add more time slots, e.g.:

         Art  Music  Math  ScienceTime_1    AB   CD     EF    GHTime_2    CD   EF     H     BTime_3                G     A

Computationally, this is simpler and faster, but obviously doesn't optimize classroom space and teacher time, and of course it wouldn't be possible if all of the possible time slots already had classes in them.

Being Intelligent About It

Let's step back for a second and think about what we know about the whole system. Here's some things we know:

  1. Similar students are likely to choose similar sets of classes
  2. If you have a set of groups of students who all are taking the same set of classes it is straightforward to schedule them

For example, if we have 4 students (2 groups of 2) who all want to take the same set of classes, it is easy to just fit the groups into a matrix:

         Class_1 Class_2Time_1     AB      CDTime_2     CD      AB

Doing it this way we can be confident in advance that there will be no conflicts. This is straightforward, scales well, and brings us to our second insight.

Start by creating groups of students who are all taking the same set of classes.

Taking this into account, we might change our algorithm above to the following:

  1. For each student in a restrictive category (i.e. dumb/clever)
  2. Loop through all of the other students in that category and create groups from all of the ones who have chosen the same set of classes
  3. If there is a group left with <8 students, fill that one with students in a non-restrictive group (normal) that have chosen that same set of classes
  4. As students are added to groups, remove them from the total pool
  5. Repeat this for all of the students in restrictive categories
  6. Schedule all of these students in a grid matrix fashion
  7. Repeat this for remaining students in the normal category

With any luck, by the time this is done you will have a much smaller set of students who have a more challenging set of scheduling requirements.

What Next?

From here the smartest thing to do would seem to depend on how many students are left in the unscheduled pool. Possibilities include:

  • Repeat the above strategy but grouping students with 4 out 5 classes in common
  • Create a list of courses that need to be created based on the requested courses in the remaining pool, then loop through each of those courses and add as many students as you can to each sequentially starting with students in more restrictive categories and filling in with the normal students
  • If the number is small enough, just manually create any remaining courses

At some point I think you'll find that it's easier just to assign schedules manually to any "oddballs."

You might consider figuring out a way to give a slightly higher weight to grouping students who are in the same grade.

Code examples

Here are some snippets of code that might help. Note, in re-reading your problem I just realized that knowledge_level is assigned on a per-course basis and not to the student overall. I'll try to make adjustments to account for that.

// function to determine whether two students have selected the same classesfunction studentsSelectedSameClasses($s1, $s2) {    // returns true if students selected the same set of classes    // returns false other wise    // this takes into account knowledge_level and will consider    // a class the same if the knowledge_levels are compatible}// create arrays of unscheduled students, i.e. not yet in groups, by grade// 9th/10th and 11th/12th together since they can be in the same classes$unscheduled["9_10"] = Student::find()->whereIn('class', [9,10])->all();$unscheduled["11_G"] = Student::find()->whereIn('class', [11,G])->all();// copy this array into another array from which we'll remove// students as they get put into groups$pool = $unscheduled;// loop through unscheduled; try to match on course selectionsforeach($unscheduled as $grade => $students) {    foreach($students as $i => $student) {        // make sure they are still in the pool, i.e. not already in a group        if(!in_array($student, $pool[$grade]) continue;        // now loop through other students        foreach($pool[$grade] as $j => $potential_classmate) {            if(studentsSelectedSameClasses($student,$potential_classmate)){                // create new groups for each class if necessary                // add both students to the groups if necessary                // remove them from the $pool                // if the group size reaches 8 go on to the next unscheduled            }        }    }}// At this point $pool may not be empty, but you should have a bunch of // easily scheduled groups and a much smaller pool to resolve

Thanks for a fun problem. I enjoyed thinking about it and hope this helps!


Interesting problem and for me I would have a suggestion on an approach al be it my brain does not construct logical problems in math ways but I have been seen to show intelligence beyond hairdressing so here I go.

I can follow the proposed lack of constriction's and it made me think of linear problems / programming that also needs precise constraints to calculate an optimum. But also that we can cut back a matrix calculation in half by first dividing it by 2 and then search a result in the lower or upper half as it can't be in both. But there is nothing to cut in half so I then thought that its more the reasonable to assume that there has to be a real life making sense sort of thing here or it won't work or adds up to quickly is my term for it:D

So I now suggest that there is logic in this approach: courses are linear going from intro course to exam. So no student that has taken the intro course may take it again as it's just stupid (Einstein et al:). So every student that has attended math1 can be excluded for that course. So if we make a gradual approach with math 1 to 5 and the rule that all courses must be attended where a course level may not differ more than -2 from the student level that equals courses attended in a linear fashion for a given student, then all students that have attended course 1 can be excluded for that class. So for math 2 only students with levels, or courses attended, 0 and 1 can attend. For math 3 students with level 1 or 2 can attend. So if we start creating the largest groups of students that can attend any course and start of by making a cut on smart and stupid straight away to save time paring pointless options as a level 4 student can never attend the same math class as a level 0,1 student?

Or something like that. Am working on that graph for this but it looks more like my neighbors' garage at this moment so dont expected much I guess..