MySQL의 ORDER BY RAND () 함수를 어떻게 최적화 할 수 있습니까?
내 쿼리를 최적화하고 싶습니다 mysql-slow.log
.
내 느린 쿼리의 대부분에는 ORDER BY RAND()
. 이 문제를 해결할 실제 해결책을 찾을 수 없습니다. MySQLPerformanceBlog에 가능한 해결책이 있지만 이것만으로는 충분하지 않다고 생각합니다. 제대로 최적화되지 않은 (또는 자주 업데이트되는 사용자 관리) 테이블에서는 작동하지 않거나 PHP
생성 된 임의 행을 선택하기 전에 두 개 이상의 쿼리를 실행해야합니다 .
이 문제에 대한 해결책이 있습니까?
더미 예 :
SELECT accomodation.ac_id,
accomodation.ac_status,
accomodation.ac_name,
accomodation.ac_status,
accomodation.ac_images
FROM accomodation, accomodation_category
WHERE accomodation.ac_status != 'draft'
AND accomodation.ac_category = accomodation_category.acat_id
AND accomodation_category.acat_slug != 'vendeglatohely'
AND ac_images != 'b:0;'
ORDER BY
RAND()
LIMIT 1
이 시도:
SELECT *
FROM (
SELECT @cnt := COUNT(*) + 1,
@lim := 10
FROM t_random
) vars
STRAIGHT_JOIN
(
SELECT r.*,
@lim := @lim - 1
FROM t_random r
WHERE (@cnt := @cnt - 1)
AND RAND(20090301) < @lim / @cnt
) i
이에 특히 효율적입니다 MyISAM
합니다 (이후 COUNT(*)
순간입니다)하지만, 심지어에서 InnoDB
그것의 10
보다 효율적인 시간 ORDER BY RAND()
.
여기서 주요 아이디어는 정렬하지 않고 대신 두 개의 변수를 유지 running probability
하고 현재 단계에서 선택할 행을 계산한다는 것 입니다.
자세한 내용은 내 블로그에서이 기사를 참조하십시오.
최신 정보:
하나의 무작위 레코드 만 선택해야하는 경우 다음을 시도하십시오.
SELECT aco.*
FROM (
SELECT minid + FLOOR((maxid - minid) * RAND()) AS randid
FROM (
SELECT MAX(ac_id) AS maxid, MIN(ac_id) AS minid
FROM accomodation
) q
) q2
JOIN accomodation aco
ON aco.ac_id =
COALESCE
(
(
SELECT accomodation.ac_id
FROM accomodation
WHERE ac_id > randid
AND ac_status != 'draft'
AND ac_images != 'b:0;'
AND NOT EXISTS
(
SELECT NULL
FROM accomodation_category
WHERE acat_id = ac_category
AND acat_slug = 'vendeglatohely'
)
ORDER BY
ac_id
LIMIT 1
),
(
SELECT accomodation.ac_id
FROM accomodation
WHERE ac_status != 'draft'
AND ac_images != 'b:0;'
AND NOT EXISTS
(
SELECT NULL
FROM accomodation_category
WHERE acat_id = ac_category
AND acat_slug = 'vendeglatohely'
)
ORDER BY
ac_id
LIMIT 1
)
)
이것은 당신이 ac_id
어느 정도 균등하게 분포되어 있다고 가정합니다 .
그것은 당신이 얼마나 무작위 적이어야 하는가에 달려 있습니다. 연결 한 솔루션은 IMO에서 잘 작동합니다. ID 필드에 큰 간격이 없으면 여전히 무작위입니다.
그러나 다음을 사용하여 하나의 쿼리에서 수행 할 수 있어야합니다 (단일 값 선택 용).
SELECT [fields] FROM [table] WHERE id >= FLOOR(RAND()*MAX(id)) LIMIT 1
기타 솔루션 :
random
테이블에 호출 된 영구 부동 필드를 추가하고 난수로 채 웁니다. 그런 다음 PHP에서 임의의 숫자를 생성하고"SELECT ... WHERE rnd > $random"
- Grab the entire list of IDs and cache them in a text file. Read the file and pick a random ID from it.
- Cache the results of the query as HTML and keep it for a few hours.
Here's how I'd do it:
SET @r := (SELECT ROUND(RAND() * (SELECT COUNT(*)
FROM accomodation a
JOIN accomodation_category c
ON (a.ac_category = c.acat_id)
WHERE a.ac_status != 'draft'
AND c.acat_slug != 'vendeglatohely'
AND a.ac_images != 'b:0;';
SET @sql := CONCAT('
SELECT a.ac_id,
a.ac_status,
a.ac_name,
a.ac_status,
a.ac_images
FROM accomodation a
JOIN accomodation_category c
ON (a.ac_category = c.acat_id)
WHERE a.ac_status != ''draft''
AND c.acat_slug != ''vendeglatohely''
AND a.ac_images != ''b:0;''
LIMIT ', @r, ', 1');
PREPARE stmt1 FROM @sql;
EXECUTE stmt1;
This will give you single sub query that will use the index to get a random id then the other query will fire getting your joined table.
SELECT accomodation.ac_id,
accomodation.ac_status,
accomodation.ac_name,
accomodation.ac_status,
accomodation.ac_images
FROM accomodation, accomodation_category
WHERE accomodation.ac_status != 'draft'
AND accomodation.ac_category = accomodation_category.acat_id
AND accomodation_category.acat_slug != 'vendeglatohely'
AND ac_images != 'b:0;'
AND accomodation.ac_id IS IN (
SELECT accomodation.ac_id FROM accomodation ORDER BY RAND() LIMIT 1
)
The solution for your dummy-example would be:
SELECT accomodation.ac_id,
accomodation.ac_status,
accomodation.ac_name,
accomodation.ac_status,
accomodation.ac_images
FROM accomodation,
JOIN
accomodation_category
ON accomodation.ac_category = accomodation_category.acat_id
JOIN
(
SELECT CEIL(RAND()*(SELECT MAX(ac_id) FROM accomodation)) AS ac_id
) AS Choices
USING (ac_id)
WHERE accomodation.ac_id >= Choices.ac_id
AND accomodation.ac_status != 'draft'
AND accomodation_category.acat_slug != 'vendeglatohely'
AND ac_images != 'b:0;'
LIMIT 1
To read more about alternatives to ORDER BY RAND()
, you should read this article.
I am optimizing a lot of existing queries in my project. Quassnoi's solution has helped me speed up the queries a lot! However, I find it hard to incorporate the said solution in all queries, especially for complicated queries involving many subqueries on multiple large tables.
So I am using a less optimized solution. Fundamentally it works the same way as Quassnoi's solution.
SELECT accomodation.ac_id,
accomodation.ac_status,
accomodation.ac_name,
accomodation.ac_status,
accomodation.ac_images
FROM accomodation, accomodation_category
WHERE accomodation.ac_status != 'draft'
AND accomodation.ac_category = accomodation_category.acat_id
AND accomodation_category.acat_slug != 'vendeglatohely'
AND ac_images != 'b:0;'
AND rand() <= $size * $factor / [accomodation_table_row_count]
LIMIT $size
$size * $factor / [accomodation_table_row_count]
works out the probability of picking a random row. The rand() will generate a random number. The row will be selected if rand() is smaller or equals to the probability. This effectively performs a random selection to limit the table size. Since there is a chance it will return less than the defined limit count, we need to increase probability to ensure we are selecting enough rows. Hence we multiply $size by a $factor (I usually set $factor = 2, works in most cases). Finally we do the limit $size
The problem now is working out the accomodation_table_row_count. If we know the table size, we COULD hard code the table size. This would run the fastest, but obviously this is not ideal. If you are using Myisam, getting table count is very efficient. Since I am using innodb, I am just doing a simple count+selection. In your case, it would look like this:
SELECT accomodation.ac_id,
accomodation.ac_status,
accomodation.ac_name,
accomodation.ac_status,
accomodation.ac_images
FROM accomodation, accomodation_category
WHERE accomodation.ac_status != 'draft'
AND accomodation.ac_category = accomodation_category.acat_id
AND accomodation_category.acat_slug != 'vendeglatohely'
AND ac_images != 'b:0;'
AND rand() <= $size * $factor / (select (SELECT count(*) FROM `accomodation`) * (SELECT count(*) FROM `accomodation_category`))
LIMIT $size
The tricky part is working out the right probability. As you can see the following code actually only calculates the rough temp table size (In fact, too rough!): (select (SELECT count(*) FROM accomodation) * (SELECT count(*) FROM accomodation_category))
But you can refine this logic to give a closer table size approximation. Note that it is better to OVER-select than to under-select rows. i.e. if the probability is set too low, you risk not selecting enough rows.
This solution runs slower than Quassnoi's solution since we need to recalculate the table size. However, I find this coding a lot more manageable. This is a trade off between accuracy + performance vs coding complexity. Having said that, on large tables this is still by far faster than Order by Rand().
Note: If the query logic permits, perform the random selection as early as possible before any join operations.
(Yeah, I will get dinged for not having enough meat here, but can't you be a vegan for one day?)
Case: Consecutive AUTO_INCREMENT without gaps, 1 row returned
Case: Consecutive AUTO_INCREMENT without gaps, 10 rows
Case: AUTO_INCREMENT with gaps, 1 row returned
Case: Extra FLOAT column for randomizing
Case: UUID or MD5 column
Those 5 cases can be made very efficient for large tables. See my blog for the details.
function getRandomRow(){
$id = rand(0,NUM_OF_ROWS_OR_CLOSE_TO_IT);
$res = getRowById($id);
if(!empty($res))
return $res;
return getRandomRow();
}
//rowid is a key on table
function getRowById($rowid=false){
return db select from table where rowid = $rowid;
}
참고URL : https://stackoverflow.com/questions/1244555/how-can-i-optimize-mysqls-order-by-rand-function
'programing' 카테고리의 다른 글
Java 프로젝트에서 패키지 이름 지정에 어떤 전략을 사용하며 그 이유는 무엇입니까? (0) | 2020.09.10 |
---|---|
매개 변수와 함께 함수에 대한 참조를 어떻게 전달할 수 있습니까? (0) | 2020.09.10 |
비 정적 데이터 멤버 및 중첩 클래스 생성자의 클래스 내 초기화를 사용할 때 오류 발생 (0) | 2020.09.10 |
RESTful 웹 서비스를 보호하는 방법은 무엇입니까? (0) | 2020.09.10 |
Spring Boot 시작 시간 단축 (0) | 2020.09.10 |